Abstract

Frame permutation quantization (FPQ) is a new vector quantization technique using finite frames. In FPQ, 
a vector is encoded using a permutation source code to quantize its frame expansion. This means that 
the encoding is a partial ordering of the frame expansion coefficients. Compared to ordinary permutation 
source coding, FPQ produces a greater number of possible quantization rates and a higher maximum rate. 
Various representations for the partitions induced by FPQ are presented, and reconstruction algorithms 
based on linear programming, quadratic programming, and recursive orthogonal projection are derived. 
Implementations of the linear and quadratic programming algorithms for uniform and Gaussian sources 
show performance improvements over entropy-constrained scalar quantization for certain combinations of 
vector dimension and coding rate. Monte Carlo evaluation of the recursive algorithm shows that
mean-squared error (MSE) decays as $M^{-4}$ for an M-element frame, which is consistent with previous results
on optimal decay of MSE. Reconstruction using the canonical dual frame is also studied, and several
results relate properties of the analysis frame to whether linear reconstruction techniques provide consistent 
reconstructions. 

Keywords: dual frame, consistent reconstruction, frame expansions, linear programming, partial orders, 
permutation source codes, quadratic programming, recursive estimation, vector quantization 



1. Introduction 

Redundant representations obtained with frames are playing an ever-expanding role in signal processing 
due to design flexibility and other desirable properties [1, 2]. One such favorable property is robustness to 
additive noise [3]. This robustness, carried over to quantization noise (without regard to whether it is
random or signal-independent), explains the success of both ordinary oversampled analog-to-digital conversion
(ADC) and Σ-Δ ADC with the canonical linear reconstruction. But the combination of frame expansions
with scalar quantization is considerably more interesting and intricate because boundedness of quantization 
noise can be exploited in reconstruction [4, 5, 6, 7, 8, 9, 10, 11, 12] and frames and quantizers can be designed 
jointly to obtain favorable performance [13]. 

This paper introduces a new use of finite frames in vector quantization: frame permutation quantization 
(FPQ). In FPQ, permutation source coding (PSC) [14, 15] is applied to a frame expansion of a vector. This 
means that the vector is represented by a partial ordering of the frame coefficients (Variant I) or by signs
of the frame coefficients that are larger than some threshold along with a partial ordering of the absolute
values of the significant coefficients (Variant II). FPQ provides a space partitioning that can be combined 
with additional signal constraints or prior knowledge to generate a variety of vector quantizers. 

Beyond the explication of the basic ideas in FPQ, the focus of this paper is on how, in analogy to works
cited above, there are several decoding procedures that can sensibly be used with the encoding of FPQ.
First, we consider using the ordinary PSC decoding for the frame coefficients followed by linear synthesis 
with the canonical dual; from the perspective of frame theory, this is the natural way to reconstruct. For 
this, we find conditions on the frame used in FPQ that relate to whether the canonical reconstruction is 
consistent. Second, taking a geometric approach based on imposing consistency yields instead
optimization-based algorithms. Third, algorithms with lower complexity can have similar performance by recursively
imposing consistency only locally [8, 12]. 

There are two distinct ways to measure the performance of FPQ, and these correspond to different 
potential uses of FPQ: data compression and data acquisition. The accuracy of signal representation — here 
measured by mean-squared error (MSE) — is important in either case. For data compression, accuracy is 
traded off against a coding rate (bits per sample). The standard alternative is scalar quantization, and 
for low delay and complexity, one considers moderate signal dimensions. It is remarkable that introducing 
redundancy through a frame expansion can improve compression, and we find that it does so only when 
the redundancy is low. For data acquisition, accuracy is traded off against the number of samples collected 
(number of frame elements). Sensors that operate at low power and high speed by outputting orderings of 
signal levels rather than absolute levels have been demonstrated and are a subject of renewed interest [16, 17]. 
By showing that the MSE can decay quickly as a function of the number of samples collected, we may 
encourage the further development of such sensors. Here computational complexity of reconstruction is 
more important because the data are recoded prior to storage or transmission. This is in close analogy 
to oversampling in analog-to-digital conversion, which is ubiquitous even though it is not advantageous in
terms of accuracy as a function of bit rate unless there is recoding at or near Nyquist rate. Note also that 
for both historical and practical reasons, data compression is typically studied for random vectors while 
data acquisition is studied for nonrandom vectors within some bounded set. This paper mixes Bayesian and 
non-Bayesian formulations accordingly. 

The paper is organized as follows: Before formal introduction to frame expansions, permutation source
codes, or their combination, Section 2 provides a preview of the geometry of FPQ. This serves both to
contrast with ordinary scalar-quantized frame expansions and to see the effect of frame redundancy. Section 3
provides the requisite background by reviewing PSCs, frames, and scalar-quantized frame expansions.
Section 4 formally defines FPQ, emphasizing constraints that are implied by the representation and hence 
must be satisfied for consistent reconstruction. Section 4 also provides reconstruction algorithms based on 
applying the constraints for consistent reconstruction globally or locally. The results on choices of frames 
in FPQ appear in Section 5. These are necessary and sufficient conditions on frames for linear reconstructions
to be consistent. Section 6 provides numerical results that demonstrate improvement in operational
distortion-rate compared to ordinary PSC and optimal decay of distortion as a function of the number of
samples. Proofs of the main results are given in Section 7. Preliminary results on FPQ were mentioned
briefly in [18].

2. Preview through Geometry 

Consider the quantization of $x \in \mathbb{R}^N$, where we restrict attention to N = 2 in this section but later allow
any finite N. The uniform scalar quantization of x partitions $\mathbb{R}^N$ in a trivial way, as shown in Fig. 1(a).
(An arbitrary segment of the plane is shown.) If over a domain of interest each component is divided into
K intervals, a partition with $K^N$ cells is obtained.

A way to increase the number of partition cells without increasing the scalar quantization resolution is 
to use a frame expansion. A conventional quantized frame expansion is obtained by scalar quantization of 
y = Fx, where $F \in \mathbb{R}^{M \times N}$ with $M \geq N$. Keeping the resolution K fixed, the partition now has more cells.
An example with M = 6 is shown in Fig. 1(d). Each frame element $\phi_k$ (transpose of row of F) induces
a hyperplane wave partition [19]: a partition formed by equally-spaced (N − 1)-dimensional hyperplanes
normal to $\phi_k$. The overall partition has M hyperplane waves and is spatially uniform. A spatial shift
invariance can be ensured formally by the use of subtractively dithered quantizers [20].




Figure 1: Partition diagrams for $x \in \mathbb{R}^2$. (a) Scalar quantization. (b) Permutation source code, Variant I.
(c) Permutation source code, Variant II. (Both permutation source codes have $n_1 = n_2 = 1$.) (d) Scalar-quantized
frame expansion with M = 6 coefficients (real harmonic tight frame). (e) Frame permutation quantizer, Variant I.
(f) Frame permutation quantizer, Variant II. (Both frame permutation quantizers have M = 6,
$m_1 = m_2 = \cdots = m_6 = 1$, and the same random frame.)

A Variant I PSC represents x just by which permutation of the components of x puts the components
in descending order. In other words, only whether $x_1 > x_2$ or $x_2 > x_1$ is specified.^1 The resulting partition
is shown in Fig. 1(b). A Variant II PSC specifies (at most) the signs of the components $x_1$ and $x_2$ and
whether $|x_1| > |x_2|$ or $|x_2| > |x_1|$. The corresponding partitioning of the plane is shown in Fig. 1(c), with
the vertical line coming from the sign of $x_1$, the horizontal line coming from the sign of $x_2$, and the diagonal
lines from $|x_1| \gtrless |x_2|$.

While low-dimensional diagrams are often inadequate in explaining PSC, several key properties are
illustrated. The partition cells are (unbounded) convex cones, giving special significance to the origin and
a lack of spatial shift invariance. The unboundedness of cells implies that some additional knowledge, such
as a bound on $\|x\|$ or a probabilistic distribution on x, is needed to compute good estimates. At first this
may seem extremely different from ordinary scalar quantization or scalar-quantized frame expansions, but
those techniques also require some prior knowledge to allow the quantizer outputs to be represented with
finite numbers of bits. We also see that the dimension N determines the maximum number of cells ($N!$ for
Variant I and $2^N N!$ for Variant II); there is no parameter analogous to scalar quantization step size that
allows arbitrary control of the resolution.

To get a finer partition without changing the dimension N, we can again employ a frame expansion.
With y = Fx as before, PSC of y gives more relative orderings with which to represent x. If $\phi_j$ and $\phi_k$ are
frame elements (transposes of rows of F) then $\langle x, \phi_j \rangle \gtrless \langle x, \phi_k \rangle$ is $\langle x, \phi_j - \phi_k \rangle \gtrless 0$ by linearity of the inner
product, so every pair of frame elements can give a condition on x. An example of a partition obtained with
Variant I and M = 6 is shown in Fig. 1(e). There are many more cells than in Fig. 1(b). Similarly, Fig. 1(f)
shows a Variant II example. The cells are still (unbounded) convex cones. If additional information such as
$\|x\|$ or an affine subspace constraint (not passing through the origin) is known, x can be specified arbitrarily
closely by increasing M.

^1 The boundary case of $x_1 = x_2$ can be handled arbitrarily in practice and safely ignored in the analysis. When the source
vector has an absolutely continuous distribution, the boundary affects neither the rate nor the distortion. For an optimal
quantizer, the boundaries will have zero probability even if the source has a discrete component [21, p. 355].
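The refinement of the partition by a frame can be checked numerically. The following is a minimal sketch, assuming N = 2 and singleton compositions; the helper names `variant1_cell` and `count_cells_2d`, the Gaussian random frame, and the sweep resolution are illustrative choices, not part of the original development. It sweeps directions on the unit circle and counts the distinct orderings of the frame coefficients.

```python
import math
import random

def variant1_cell(frame, x):
    """Ordering (tuple of indices) of the coefficients <x, phi_k>, largest first."""
    coeffs = [sum(p * v for p, v in zip(phi, x)) for phi in frame]
    return tuple(sorted(range(len(frame)), key=lambda k: -coeffs[k]))

def count_cells_2d(frame, num_angles=20000):
    """Count distinct orderings seen while sweeping directions around the unit circle."""
    cells = set()
    for t in range(num_angles):
        theta = 2.0 * math.pi * (t + 0.5) / num_angles
        cells.add(variant1_cell(frame, (math.cos(theta), math.sin(theta))))
    return len(cells)

random.seed(0)  # arbitrary seed, only for repeatability
identity_frame = [(1.0, 0.0), (0.0, 1.0)]  # ordinary PSC in R^2 (M = N = 2)
random_frame = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(6)]  # M = 6

psc_cells = count_cells_2d(identity_frame)
fpq_cells = count_cells_2d(random_frame)
print(psc_cells, fpq_cells)
```

Because the cells are cones, sampling directions suffices. In the plane the 15 pairwise differences of 6 frame elements define at most 15 lines through the origin, so at most 30 cells can appear.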



3. Background 

Having illustrated the basic idea of PSC and our generalization using frames to provide resolution control, 
we now formalize the background material. We assume throughout fixed-rate coding and the conventional 
squared-error fidelity criterion $\|x - \hat{x}\|^2$ between source x and reproduction $\hat{x}$. Some statements, especially
those pertaining to data compression, assume a known source distribution over which performance is measured
in expectation. Most statements for data acquisition with $M \to \infty$ apply pointwise over x.

3.1. Vector Quantization 

A vector quantizer is a mapping from an input $x \in \mathbb{R}^N$ to a codeword $\hat{x}$ from a finite codebook C. Without
loss of generality, a vector quantizer can be seen as the combination of an encoder

$\alpha : \mathbb{R}^N \to \mathcal{I}$

and a decoder

$\beta : \mathcal{I} \to \mathbb{R}^N$,

where $\mathcal{I}$ is a finite index set. The encoder partitions $\mathbb{R}^N$ into $|\mathcal{I}|$ regions or cells $\{\alpha^{-1}(i)\}_{i \in \mathcal{I}}$, and the decoder
assigns a reproduction value to each cell. Examples of partitions are given in Fig. 1. For the quantizer to
output R bits per component, we have $|\mathcal{I}| = 2^{NR}$.

For any codebook (i.e., any $\beta$), the encoder $\alpha$ that minimizes $\|x - \hat{x}\|^2$ maps x to the nearest element
of the codebook. The partition is thus composed of convex cells. Since the cells are convex, reproduction
values are optimally within the corresponding cells, whether to minimize mean-squared error distortion,
maximum squared error, or some other reasonable function of squared error. To minimize maximum squared
error, reproduction values should be at centers of cells; to minimize expected distortion, they should be at
centroids of cells. Reproduction values being within corresponding cells is formalized as consistency:

Definition 3.1. The reconstruction $\hat{x} = \beta(\alpha(x))$ is called a consistent reconstruction of x when $\alpha(x) = \alpha(\hat{x})$
(or equivalently $\beta(\alpha(\hat{x})) = \hat{x}$). The decoder $\beta$ is called consistent when $\beta(\alpha(x))$ is a consistent reconstruction
of x for all x.

In practice, the pair $(\alpha, \beta)$ usually does not minimize any desired distortion criterion for a given codebook
size because the optimal mappings are hard to design and hard to implement [21]. The mappings are
commonly designed subject to certain structural constraints, and $\beta$ may not even be consistent for $\alpha$ [4, 6].

3.2. Permutation Source Codes 

A permutation source code is a vector quantizer with the defining characteristic that codewords are 
related through permutations and, possibly, sign changes. Permutation codes were originally introduced as 
channel codes by Slepian [22]. They were then applied to a specific source coding problem, through the 
duality between source encoding and channel decoding, by Dunn [14] and developed in greater generality
by Berger et al. [15, 23, 24]. Permutation codes are generated by the group action of a permutation group 
and are thus examples of group codes [25]. 



3.2.1. Definitions 

There are two variants of permutation codes:

Variant I: Here codewords are related through permutations, without sign changes. Let $\mu_1 > \mu_2 > \cdots > \mu_K$
be real numbers, and let $n_1, n_2, \ldots, n_K$ be positive integers that sum to N (an (ordered) composition of
N). The initial codeword of the codebook C has the form

$\hat{x}_{\mathrm{init}} = (\underbrace{\mu_1, \ldots, \mu_1}_{n_1}, \underbrace{\mu_2, \ldots, \mu_2}_{n_2}, \ldots, \underbrace{\mu_K, \ldots, \mu_K}_{n_K})$, (1)

where each $\mu_i$ appears $n_i$ times. When $\hat{x}_{\mathrm{init}}$ has this form, we call it compatible with $(n_1, n_2, \ldots, n_K)$. The
codebook is the set of all distinct permutations of $\hat{x}_{\mathrm{init}}$. The number of codewords in C is thus given by the
multinomial coefficient

$L_{\mathrm{I}} = \frac{N!}{n_1! \, n_2! \cdots n_K!}$. (2a)

The permutation structure of the codebook enables low-complexity nearest-neighbor encoding [15]: map
x to the codeword $\hat{x}$ whose components have the same order as x; in other words, replace the $n_1$ largest
components of x with $\mu_1$, the $n_2$ next-largest components of x with $\mu_2$, and so on.

Variant II: Here codewords are related through permutations and sign changes. Let $\mu_1 > \mu_2 > \cdots > \mu_K \geq 0$
be nonnegative real numbers, and let $(n_1, n_2, \ldots, n_K)$ be a composition of N. The initial codeword
has the same form as in (1), and the codebook now consists of all distinct permutations of $\hat{x}_{\mathrm{init}}$ with each
possible sign for each nonzero component. The number of codewords in C is thus given by

$L_{\mathrm{II}} = 2^h \frac{N!}{n_1! \, n_2! \cdots n_K!}$, (2b)

where $h = N$ if $\mu_K > 0$ and $h = N - n_K$ if $\mu_K = 0$.

Nearest-neighbor encoding for Variant II PSCs can be implemented as follows [15]: map x to the codeword
$\hat{x}$ whose components have the same order in absolute value and match the signs of corresponding components
of x. Since the complexity of sorting a vector of length N is $O(N \log N)$ operations, the encoding complexity
for either PSC variant is much lower than with an unstructured source code and only $O(\log N)$ times higher
than scalar quantization.
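The sorting-based encoders described above can be sketched as follows. This is an illustration only: the function names and the example composition and levels are hypothetical, and the composition is assumed to sum to the length of x.

```python
def psc_encode_variant1(x, composition, mu):
    """Nearest-neighbor Variant I encoding: the n_1 largest entries of x become mu[0], etc."""
    order = sorted(range(len(x)), key=lambda i: -x[i])  # indices, largest entry first
    codeword = [0.0] * len(x)
    pos = 0
    for n_i, mu_i in zip(composition, mu):
        for idx in order[pos:pos + n_i]:
            codeword[idx] = mu_i
        pos += n_i
    return codeword

def psc_encode_variant2(x, composition, mu):
    """Variant II: order by absolute value, then copy the signs of x onto the codeword."""
    magnitudes = psc_encode_variant1([abs(v) for v in x], composition, mu)
    return [m if v >= 0 else -m for m, v in zip(magnitudes, x)]

x = [0.3, -1.2, 2.0, 0.1]
v1 = psc_encode_variant1(x, (1, 2, 1), (1.5, 0.0, -1.5))
v2 = psc_encode_variant2(x, (1, 3), (2.0, 0.5))
print(v1)  # [0.0, -1.5, 1.5, 0.0]
print(v2)  # [0.5, -0.5, 2.0, 0.5]
```

The only nontrivial operation in either variant is the sort, which is the source of the $O(N \log N)$ complexity.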

With the codebook sizes given in (2), the per-component rate is defined as

$R = N^{-1} \log_2 L$. (3)

Under certain symmetry conditions on the source distribution, all codewords are equally likely so the rate 
cannot be reduced by entropy coding. This generation of fixed-rate output — avoiding the possibility of buffer 
overflow associated with entropy coding of the highly nonequiprobable outputs of a quantizer [26] — is a known 
advantage of PSCs [15]. An efficient enumeration of permutations, to generate a binary representation, is 
described in [27]. 
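As a small worked example of (2) and (3), the sketch below computes codebook sizes and rates for a given composition. The function names and the flag `last_level_zero` (standing for the case $\mu_K = 0$) are illustrative, not notation from the PSC literature.

```python
import math

def psc_codebook_size(composition, variant=1, last_level_zero=False):
    """Codebook size from (2a) for Variant I or (2b) for Variant II."""
    n_total = sum(composition)
    size = math.factorial(n_total)
    for n_i in composition:
        size //= math.factorial(n_i)  # exact: partial quotients are integers
    if variant == 2:
        # h = N - n_K when the smallest level is zero (its signs carry no information)
        h = n_total - composition[-1] if last_level_zero else n_total
        size *= 2 ** h
    return size

def psc_rate(composition, variant=1, last_level_zero=False):
    """Per-component rate R = log2(L) / N from (3)."""
    size = psc_codebook_size(composition, variant, last_level_zero)
    return math.log2(size) / sum(composition)

print(psc_codebook_size((1, 2, 1)))                                   # 12
print(psc_codebook_size((1, 2, 1), variant=2))                        # 192
print(psc_codebook_size((1, 2, 1), variant=2, last_level_zero=True))  # 96
print(psc_rate((1, 2, 1)))
```

For fixed N only finitely many rates are attainable, one per composition and variant, which is the limitation that FPQ relaxes.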



3.2.2. Partition Properties 

For both historical reasons and to match the conventional approach to vector quantization, PSCs were 
defined above in terms of a codebook structure, and the codebook structure led to an encoding procedure. 
Note that we may now examine the partitions induced by PSCs separately from the particular codebooks 
for which they are nearest-neighbor partitions. 

The partition induced by a Variant I PSC is completely determined by the composition $(n_1, n_2, \ldots, n_K)$.
Specifically, the encoding mapping can index the permutation P that places the $n_1$ largest components of
x in the first $n_1$ positions (without changing the order within those $n_1$ components), the $n_2$ next-largest
components of x in the next $n_2$ positions, and so on; the $\mu_i$'s are actually immaterial. This encoding
places all source vectors x such that Px is n-descending in the same partition cell, where n-descending is
defined as follows.

Definition 3.2. Given a composition $n = (n_1, n_2, \ldots, n_K)$ of N, a vector in $\mathbb{R}^N$ is called n-descending if
its $n_1$ largest entries are in the first $n_1$ positions, its $n_2$ next-largest components are in the next $n_2$ positions,
etc.



The property of being n-descending is to be descending up to the arbitrariness specified by the composition 
n. 
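Definition 3.2 can be tested directly. The sketch below, with the hypothetical helper `is_n_descending`, splits a vector into consecutive blocks of sizes $n_1, n_2, \ldots$ and checks that every entry of a block is at least as large as every entry of the next block; ties are accepted here, which is one possible convention for the boundary cases.

```python
def is_n_descending(v, composition):
    """True if v is descending up to the arbitrariness allowed within each block."""
    blocks, pos = [], 0
    for n_i in composition:
        blocks.append(v[pos:pos + n_i])
        pos += n_i
    # Adjacent blocks suffice: block i dominates block i+1 for all i.
    return all(min(a) >= max(b) for a, b in zip(blocks, blocks[1:]))

print(is_n_descending([3, 1, 2, 0], (1, 2, 1)))  # True: order inside the middle block is free
print(is_n_descending([1, 3, 2, 0], (1, 2, 1)))  # False: the largest entry is not first
```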

Because this is nearest-neighbor encoding for some codebook, the partition cells must be convex. Furthermore,
multiplying x by any nonnegative scalar does not affect the encoding, so the cells are convex
cones. (This was discussed and illustrated in Section 2.) We develop a convenient representation for the
partition in Section 4.

The situation is only slightly more complicated for Variant II PSCs. The partition is determined by the
composition $(n_1, n_2, \ldots, n_K)$ and whether or not the signs of the smallest-magnitude components should be
encoded (whether $\mu_K = 0$, in the codebook-centric view).

The PSC literature has mostly emphasized the design of PSCs for sources with i.i.d. components. But 
as developed in Section 4, the simple structured encoding of PSCs could be combined with unconventional 
decoding techniques for other sources. The possible suitability of PSCs for sources with unknown or
time-varying statistics has been previously observed [15].



3.2.3. Codebook Optimization 

With the encoding procedure now fixed, let us turn to the decoder (or codebook) design. For this we 
assume that x is random and that the components of x are i.i.d. 

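Let $\xi_1 \geq \xi_2 \geq \cdots \geq \xi_N$ denote the order statistics of random vector $x = (x_1, \ldots, x_N)$ and
$\eta_1 \geq \eta_2 \geq \cdots \geq \eta_N$ denote the order statistics of random vector $|x| = (|x_1|, \ldots, |x_N|)$.^2 For a given initial codeword
$\hat{x}_{\mathrm{init}}$, the per-letter distortions of optimally-encoded Variant I and Variant II PSCs are given by

$D_{\mathrm{I}} = N^{-1} E\left[ \sum_{i=1}^{K} \sum_{\ell \in I_i} (\xi_\ell - \mu_i)^2 \right]$ (4a)

and

$D_{\mathrm{II}} = N^{-1} E\left[ \sum_{i=1}^{K} \sum_{\ell \in I_i} (\eta_\ell - \mu_i)^2 \right]$, (4b)

where the $I_i$'s are the sets of indexes generated by the composition:

$I_1 = \{1, 2, \ldots, n_1\}$, (5a)

$I_i = \left\{ \left( \sum_{k=1}^{i-1} n_k \right) + 1, \ldots, \sum_{k=1}^{i} n_k \right\}$, $i \geq 2$. (5b)

These distortions can be deduced simply by examining which components of x are mapped to which elements
of $\hat{x}_{\mathrm{init}}$.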

Optimization of (4a) and (4b) over both $\{n_i\}_{i=1}^{K}$ and $\{\mu_i\}_{i=1}^{K}$ subject to (3) is difficult, partly due to the
integer constraint of the composition. However, given a composition $(n_1, n_2, \ldots, n_K)$, the optimal initial
codeword can be determined easily from the means of the order statistics. In particular, the optimal $\{\mu_i\}_{i=1}^{K}$
of Variant I and Variant II PSCs are given by

$\mu_i = n_i^{-1} \sum_{\ell \in I_i} E[\xi_\ell]$, for Variant I, (6a)

and

$\mu_i = n_i^{-1} \sum_{\ell \in I_i} E[\eta_\ell]$, for Variant II. (6b)



The analysis of [23] shows that when N is large, the optimal composition gives performance equal to
entropy-constrained scalar quantization (ECSQ) of x. Performance does not strictly improve with increasing N;
permutation codes outperform ECSQ for certain combinations of block size and rate [29].
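The rule that each optimal level is the average of the means of the order statistics in its block can be approximated by simulation. The following sketch assumes an i.i.d. standard Gaussian source and estimates the means by Monte Carlo; the function name, trial count, and seed are arbitrary illustrative choices, and an exact computation would instead use the order-statistic means directly.

```python
import random

def optimal_levels_monte_carlo(composition, variant=1, trials=20000, seed=0):
    """Estimate the optimal levels mu_i for an i.i.d. N(0, 1) source."""
    rng = random.Random(seed)
    n_total = sum(composition)
    sums = [0.0] * n_total
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_total)]
        if variant == 2:
            sample = [abs(v) for v in sample]  # Variant II orders magnitudes
        for rank, value in enumerate(sorted(sample, reverse=True)):
            sums[rank] += value  # rank 0 is the largest order statistic
    means = [s / trials for s in sums]
    levels, pos = [], 0
    for n_i in composition:
        levels.append(sum(means[pos:pos + n_i]) / n_i)  # average over the block I_i
        pos += n_i
    return levels

mu = optimal_levels_monte_carlo((1, 2, 1))
print(mu)
```

For the symmetric composition (1, 2, 1) and a symmetric source, the estimated levels come out approximately antisymmetric, with a middle level near zero.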



^2 For consistency with earlier literature on PSCs, we are reversing the usual sorting of order statistics [28].



3.3. Frame Definitions and Classifications 

The theory of finite-dimensional frames is often developed for a Hilbert space of complex vectors. In 
this paper, we use frame expansions only for quantization using PSCs, which rely on order relations of real 
numbers. Therefore we limit ourselves to real finite frames. We maintain the Hermitian transpose notation * 
where a transpose would suffice because this makes several expressions have familiar appearances. 

The Hilbert space of interest is $\mathbb{R}^N$ equipped with the standard inner product (dot product),

$\langle x, y \rangle = x^T y = \sum_{k=1}^{N} x_k y_k$,

for $x = [x_1, x_2, \ldots, x_N]^T \in \mathbb{R}^N$ and $y = [y_1, y_2, \ldots, y_N]^T \in \mathbb{R}^N$. The norm of a vector x is naturally induced
from the inner product,

$\|x\| = \sqrt{\langle x, x \rangle}$.

Definition 3.3 ([3]). A set of N-dimensional vectors, $\Phi = \{\phi_k\}_{k=1}^{M} \subset \mathbb{R}^N$, is called a frame if there exist
a lower frame bound, $A > 0$, and an upper frame bound, $B < \infty$, such that

$A \|x\|^2 \leq \sum_{k=1}^{M} |\langle x, \phi_k \rangle|^2 \leq B \|x\|^2$, for all $x \in \mathbb{R}^N$. (7a)

The matrix $F \in \mathbb{R}^{M \times N}$ with kth row equal to $\phi_k^*$ is called the analysis frame operator. F and $\Phi$ will be used
interchangeably to refer to a frame. Equivalent to (7a) in matrix form is

$A I_N \leq F^* F \leq B I_N$, (7b)

where $I_N$ is the N × N identity matrix.

The lower bound in (7) implies that $\Phi$ spans $\mathbb{R}^N$; thus a frame must have $M \geq N$. It is therefore
reasonable to call the ratio r = M/N the redundancy of the frame. A frame is called a tight frame if the
frame bounds can be chosen to be equal. A frame is an equal-norm frame if all of its vectors have the
same norm. If an equal-norm frame is normalized to have all vectors of unit norm, we call it a unit-norm
frame (or sometimes normalized frame or uniform frame). For a unit-norm frame, it is easy to verify that
$A \leq r \leq B$. Thus, a unit-norm tight frame (UNTF) must satisfy A = r = B and

$F^* F = r I_N$. (8)
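The tightest frame bounds in (7b) are the extreme eigenvalues of $F^* F$, which gives a simple numerical check of (8). The sketch below uses NumPy and, as an assumed example, three unit vectors in the plane spaced 120 degrees apart; the helper name `frame_bounds` is illustrative.

```python
import numpy as np

def frame_bounds(F):
    """Tightest frame bounds (A, B): smallest and largest eigenvalues of F* F."""
    eigenvalues = np.linalg.eigvalsh(F.T @ F)  # returned in ascending order
    return eigenvalues[0], eigenvalues[-1]

# Three unit-norm vectors in R^2 at 120-degree spacing (rows of F).
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
F = np.column_stack([np.cos(angles), np.sin(angles)])

A, B = frame_bounds(F)
r = F.shape[0] / F.shape[1]  # redundancy M / N
print(A, B, r)
```

Here both bounds coincide with the redundancy r = 3/2, so this frame is a UNTF.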

Naimark's theorem [30] provides an efficient way to characterize the class of equal-norm tight frames: a
set of vectors is an equal-norm tight frame if and only if it is the orthogonal projection (up to a scale factor)
of an orthonormal basis of an ambient Hilbert space on to some subspace.^3 As a consequence, deleting
the last (M − N) columns of the (normalized) discrete Fourier transform (DFT) matrix in $\mathbb{C}^{M \times M}$ yields a
particular subclass of UNTFs called (complex) harmonic tight frames (HTFs). One can adapt this derivation
to construct real HTFs [31], which are always UNTFs, as follows.

Definition 3.4. The real harmonic tight frame of M vectors in $\mathbb{R}^N$ is defined for even N by

$\phi_{k+1}^* = \sqrt{\frac{2}{N}} \left[ \cos\frac{k\pi}{M}, \cos\frac{3k\pi}{M}, \ldots, \cos\frac{(N-1)k\pi}{M}, \sin\frac{k\pi}{M}, \sin\frac{3k\pi}{M}, \ldots, \sin\frac{(N-1)k\pi}{M} \right]$ (9a)

and for odd N by

$\phi_{k+1}^* = \sqrt{\frac{2}{N}} \left[ \frac{1}{\sqrt{2}}, \cos\frac{2k\pi}{M}, \cos\frac{4k\pi}{M}, \ldots, \cos\frac{(N-1)k\pi}{M}, \sin\frac{2k\pi}{M}, \sin\frac{4k\pi}{M}, \ldots, \sin\frac{(N-1)k\pi}{M} \right]$, (9b)

where k = 0, 1, ..., M − 1. The modulated harmonic tight frames are defined by

$\psi_k = \gamma (-1)^k \phi_k$, for k = 1, 2, ..., M, (10)

where $\gamma = 1$ or $\gamma = -1$ (fixed for all k).

^3 The theorem holds for a general separable Hilbert space of possibly infinite dimension.
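The construction in Definition 3.4 can be checked numerically. The sketch below builds the analysis frame operator from (9a) and (9b), including the $\sqrt{2/N}$ normalization that makes the vectors unit norm, and the modulated version from (10). The function names are illustrative and NumPy is assumed.

```python
import numpy as np

def real_htf(M, N):
    """Analysis frame operator (M x N) of the real harmonic tight frame; row k+1 follows (9a)/(9b)."""
    k = np.arange(M).reshape(-1, 1)
    if N % 2 == 0:
        freqs = np.arange(1, N, 2)  # 1, 3, ..., N-1
        blocks = [np.cos(freqs * k * np.pi / M), np.sin(freqs * k * np.pi / M)]
    else:
        freqs = np.arange(2, N, 2)  # 2, 4, ..., N-1
        blocks = [np.full((M, 1), 1 / np.sqrt(2)),
                  np.cos(freqs * k * np.pi / M), np.sin(freqs * k * np.pi / M)]
    return np.sqrt(2.0 / N) * np.hstack(blocks)

def modulated_htf(M, N, gamma=1):
    """Rows psi_k = gamma * (-1)^k * phi_k for k = 1, ..., M, as in (10)."""
    signs = (-1.0) ** np.arange(1, M + 1)
    return gamma * signs[:, None] * real_htf(M, N)

F_even = real_htf(7, 4)
F_odd = real_htf(5, 3)
print(np.round(F_even.T @ F_even, 12))  # (M / N) times the identity
```

Modulation only flips signs of rows, so the modulated frame satisfies (8) with the same redundancy.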

HTFs can be viewed as the result of a group of orthogonal operators acting on one generating vector [2] . 
This property has been generalized in [32, 33] under the name geometrically-uniform frames (GUFs). Note 
that a GUF is a special case of a group code as developed by Slepian [22, 25]. An interesting connection 
between PSCs and GUFs is that under certain conditions, a PSC codebook is a GUF with generating vector 
$\hat{x}_{\mathrm{init}}$ and the generating group action provided by all permutation matrices [34].

Classification of frames is often up to some unitary equivalence [31]. Adopting the terminology of Holmes 
and Paulsen [35], we say two frames in $\mathbb{R}^N$, $\Phi = \{\phi_k\}_{k=1}^{M}$ and $\Psi = \{\psi_k\}_{k=1}^{M}$, are

(i) Type I equivalent if there is an orthogonal matrix U such that $\psi_k = U \phi_k$ for all k;

(ii) Type II equivalent if there is a permutation $\sigma(\cdot)$ on {1, 2, ..., M} such that $\psi_k = \phi_{\sigma(k)}$ for all k; and

(iii) Type III equivalent if there is a sign function in k, $\delta(k) = \pm 1$, such that $\psi_k = \delta(k) \phi_k$ for all k.

It will be evident that FPQ performance is always invariant to Type II equivalence; invariant to Type I 
equivalence when the source distribution is rotationally invariant; and invariant to Type III equivalence 
under Variant II but not, in general, under Variant I. 

It is important to note that for M = N+1 there is exactly one equivalence class of UNTFs [31, Thm. 2.6]. 
Since HTFs are always UNTFs, the following property follows directly from [31, Thm. 2.6]. 

Proposition 3.5. Assume that M = N + 1, and $\Phi = \{\phi_k\}_{k=1}^{M} \subset \mathbb{R}^N$ is the real HTF. Then every UNTF
$\Psi = \{\psi_k\}_{k=1}^{M}$ can be written as

$\psi_k = \delta(k) U \phi_{\sigma(k)}$, for k = 1, 2, ..., M, (11)

where $\delta(k) = \pm 1$ is some sign function in k, U is some orthogonal matrix, and $\sigma(\cdot)$ is some permutation on
the index set {1, 2, ..., M}.

Another important subclass of UNTFs is defined as follows: 

Definition 3.6 ([36, 37]). A UNTF $\Phi = \{\phi_k\}_{k=1}^{M} \subset \mathbb{R}^N$ is called an equiangular tight frame (ETF) if
there exists a constant a such that $|\langle \phi_\ell, \phi_k \rangle| = a$ for all $1 \leq \ell < k \leq M$.

ETFs are sometimes called optimal Grassmannian frames or 2-uniform frames. They prove to have rich
application in communications, coding theory, and sparse approximation [36, 38, 35]. For a general pair
(M, N), the existence and constructions of such frames is not fully understood. Partial answers can be found
in [37, 39, 40].

In our analysis of FPQ, we will find that restricted ETFs, where the absolute value constraint can be
removed from Definition 3.6, play a special role. In matrix view, a restricted ETF satisfies $F^* F = r I_N$ and
$F F^* = (1 - a) I_M + a J_M$, where $J_M$ is the all-1s matrix of size M × M. The following proposition specifies
the restricted ETFs for the codimension-1 case.

Proposition 3.7. For M = N + 1, the family of all restricted ETFs is constituted by the Type I and Type II
equivalents of modulated HTFs.

Proof. See Section 7.1.

The following property of modulated HTFs in the M = N + 1 case will be very useful.

Proposition 3.8. If M = N + 1 then a modulated harmonic tight frame is a zero-sum frame, i.e., each 
column of the analysis frame operator F sums to zero. 



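Proof. We only consider the case when N is even; the N odd case is similar. For each $\ell \in \{1, 2, \ldots, N\}$,
let $\phi_k^\ell$ and $\psi_k^\ell$ denote the $\ell$th components of the vectors $\phi_k$ and $\psi_k$, and let $S_\ell = \sum_{k=1}^{M} \psi_k^\ell$ denote the sum
of the entries in column $\ell$ of matrix F.

For $1 \leq \ell \leq N/2$, using (9a), (10), and Euler's formula, we have

$S_\ell = \gamma \sqrt{\frac{2}{N}} \sum_{k=0}^{M-1} (-1)^{k+1} \cos\frac{(2\ell-1)k\pi}{M}$

$= -\frac{\gamma}{\sqrt{2N}} \sum_{k=0}^{M-1} (-1)^{k} \left[ e^{j(2\ell-1)k\pi/M} + e^{-j(2\ell-1)k\pi/M} \right]$

$= -\frac{\gamma}{\sqrt{2N}} \left[ \sum_{k=0}^{M-1} e^{j\pi((2\ell-1)/M+1)k} + \sum_{k=0}^{M-1} e^{-j\pi((2\ell-1)/M+1)k} \right]$

$= -\frac{\gamma}{\sqrt{2N}} \left[ \frac{1 - e^{j\pi(2\ell+M-1)}}{1 - e^{j\pi((2\ell-1)/M+1)}} + \frac{1 - e^{-j\pi(2\ell+M-1)}}{1 - e^{-j\pi((2\ell-1)/M+1)}} \right]$

$= 0$, (12)

where (12) follows from the fact that $2\ell + M - 1 = 2\ell + N$ is an even integer.

For $N/2 < \ell \leq N$, we can show that $S_\ell = 0$ similarly, and so the proposition is proved.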
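The zero-sum property can also be confirmed numerically for a particular case. The sketch below assumes even N with M = N + 1 and uses a hypothetical helper `modulated_htf_even` that builds the rows $\psi_k = \gamma(-1)^k \phi_k$ from (9a) and (10) with plain Python lists.

```python
import math

def modulated_htf_even(N, gamma=1):
    """Rows psi_k, k = 1..M, of the modulated real HTF with M = N + 1 (N even)."""
    M = N + 1
    scale = math.sqrt(2.0 / N)
    freqs = range(1, N, 2)  # 1, 3, ..., N-1
    rows = []
    for k in range(M):  # phi_{k+1} uses the index k = 0..M-1, as in (9a)
        phi = [scale * math.cos(f * k * math.pi / M) for f in freqs] + \
              [scale * math.sin(f * k * math.pi / M) for f in freqs]
        sign = gamma * (-1) ** (k + 1)  # modulation of the (k+1)th frame vector
        rows.append([sign * v for v in phi])
    return rows

N = 4
F = modulated_htf_even(N)
column_sums = [sum(row[col] for row in F) for col in range(N)]
print(column_sums)  # every entry is zero up to rounding error
```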



3.4. Reconstruction from Frame Expansions

A central use of frames is to formalize the reconstruction of $x \in \mathbb{R}^N$ from the frame expansion
$y_k = \langle x, \phi_k \rangle$, k = 1, 2, ..., M, or estimation of x from degraded versions of the frame expansion. Using the
analysis frame operator we have y = Fx, and (7) implies the existence of at least one linear synthesis
operator G such that $GF = I_N$. A frame with analysis frame operator $G^*$ is then said to be dual to $\Phi$.

The frame condition (7) also implies that $F^* F$ is invertible, so the Moore-Penrose inverse (pseudo-inverse)
of the frame operator

$F^\dagger = (F^* F)^{-1} F^*$

exists and is a valid synthesis operator. Using the pseudo-inverse for reconstruction has several important
properties including an optimality for mean-squared error (MSE) under assumptions of uncorrelated zero-mean
additive noise and linear synthesis [3, Sect. 3.2]. This follows from the fact that $F F^\dagger$ is an orthogonal
projection from $\mathbb{R}^M$ onto the subspace $F(\mathbb{R}^N)$, the range of F. Because of this special role, reconstruction
using $F^\dagger$ is called canonical reconstruction and the corresponding frame is called the canonical dual. In this
paper, we use the term linear reconstruction for reconstruction using an arbitrary linear operator.
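A short numerical illustration of canonical reconstruction follows. It is a sketch assuming a random Gaussian analysis frame, which has full column rank with probability one; the variable names and seed are arbitrary and NumPy is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for repeatability
M, N = 6, 2
F = rng.standard_normal((M, N))          # hypothetical analysis frame operator
F_dagger = np.linalg.inv(F.T @ F) @ F.T  # canonical dual synthesis: (F* F)^{-1} F*

x = rng.standard_normal(N)
y = F @ x                 # frame expansion
x_rec = F_dagger @ y      # exact recovery from unquantized coefficients
P = F @ F_dagger          # orthogonal projection onto the range of F

print(np.allclose(x_rec, x), np.allclose(P @ P, P), np.allclose(P, P.T))
```

The projection removes the component of any perturbation of y that is orthogonal to the range of F, which is the mechanism behind the noise reduction of canonical reconstruction.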

When y is quantized to $\hat{y}$, it is possible for the quantization noise $\hat{y} - y$ to have mean zero and uncorrelated
components; this occurs with subtractive dithered quantization [20] or under certain asymptotics [41]. In
this case, the optimality of canonical reconstruction holds. However, it should be noted that even with these
restrictions, canonical reconstruction is optimal only amongst linear reconstructions.

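When nonlinear reconstruction is allowed, quantization noise may behave fundamentally differently than
other additive noise. The key is that a quantized value gives hard constraints that can be exploited in
reconstruction. For example, suppose that $\hat{y}$ is obtained from y by rounding each element to the nearest
multiple of a quantization step size $\Delta$. Then knowledge of $\hat{y}_k$ is equivalent to knowing

$y_k \in [\hat{y}_k - \tfrac{1}{2}\Delta, \; \hat{y}_k + \tfrac{1}{2}\Delta]$. (13)

Geometrically, $\langle x, \phi_k \rangle = \hat{y}_k - \tfrac{1}{2}\Delta$ and $\langle x, \phi_k \rangle = \hat{y}_k + \tfrac{1}{2}\Delta$ are hyperplanes perpendicular to $\phi_k$, and (13)
expresses that x must lie between these hyperplanes; this may be visualized as one pair of parallel lines in
Fig. 1(d). Using the upper and lower bounds on all M components of y, the constraints on x imposed by $\hat{y}$
are readily expressed as (see [6])

$\begin{bmatrix} F \\ -F \end{bmatrix} x \leq \begin{bmatrix} \tfrac{1}{2}\Delta \mathbf{1}_{M \times 1} + \hat{y} \\ \tfrac{1}{2}\Delta \mathbf{1}_{M \times 1} - \hat{y} \end{bmatrix}$, (14)

where the inequalities are elementwise. For example, all 2M constraints specify a single cell in Fig. 1(d). This
formulation inspires Algorithm 1, which is a modification of [6, Table I] using the principle of maximizing
slackness of inequalities that was also implemented in [8]. Section 4.3 presents analogues to (14) and
Algorithm 1 for FPQ.

Algorithm 1 Consistent Reconstruction from Scalar-Quantized Frame Expansion

Inputs: Analysis frame operator F, quantization step size $\Delta$, and quantized frame expansion $\hat{y}$
Output: Estimate $\hat{x}$ consistent with $\hat{y}$ and as far from the partition boundaries as possible

1. Let $A = \begin{bmatrix} F & \mathbf{1}_{M \times 1} \\ -F & \mathbf{1}_{M \times 1} \end{bmatrix}$ and $b = \begin{bmatrix} \tfrac{1}{2}\Delta \mathbf{1}_{M \times 1} + \hat{y} \\ \tfrac{1}{2}\Delta \mathbf{1}_{M \times 1} - \hat{y} \end{bmatrix}$.
   (Consistency as in (14) is expressed as $A \begin{bmatrix} x \\ 0 \end{bmatrix} \leq b$.)
2. Let $c = \begin{bmatrix} 0_{N \times 1} \\ -1 \end{bmatrix}$.
3. Use a linear programming method to minimize $c^T \begin{bmatrix} x \\ \delta \end{bmatrix}$ subject to $A \begin{bmatrix} x \\ \delta \end{bmatrix} \leq b$.
   Return the first N components of the minimizer as $\hat{x}$.

Algorithm 2 Recursive Estimation from Scalar-Quantized Frame Coefficient Sequence

Inputs: Analysis frame sequence $\{\phi_k\}_{k=1}^{M}$, quantization step size $\Delta$, and quantized coefficients $\{\hat{y}_k\}_{k=1}^{M}$
Output: Estimate $\hat{x}$

1. Let $\hat{x}_0 = 0_{N \times 1}$ and let k = 1.
2. If $\langle \hat{x}_{k-1}, \phi_k \rangle < \hat{y}_k - \tfrac{1}{2}\Delta$, let $\hat{x}_k = \hat{x}_{k-1} + (\hat{y}_k - \tfrac{1}{2}\Delta - \langle \hat{x}_{k-1}, \phi_k \rangle) \phi_k / \|\phi_k\|^2$;
   else if $\langle \hat{x}_{k-1}, \phi_k \rangle > \hat{y}_k + \tfrac{1}{2}\Delta$, let $\hat{x}_k = \hat{x}_{k-1} + (\hat{y}_k + \tfrac{1}{2}\Delta - \langle \hat{x}_{k-1}, \phi_k \rangle) \phi_k / \|\phi_k\|^2$;
   else let $\hat{x}_k = \hat{x}_{k-1}$.
3. If k = M, return $\hat{x}_k$; else increment k and go to step 2.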
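Algorithm 1 can be prototyped with an off-the-shelf solver. The sketch below uses `scipy.optimize.linprog` as one possible linear programming method (an assumption on our part; the algorithm does not prescribe a solver). The variable bounds are set explicitly to unbounded because `linprog` constrains variables to be nonnegative by default. The random frame, step size, and seed are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

def consistent_reconstruction(F, step, y_hat):
    """Maximize the slack delta subject to F x + delta <= step/2 + y_hat and
    -F x + delta <= step/2 - y_hat, then return the x part of the solution."""
    M, N = F.shape
    ones = np.ones((M, 1))
    A = np.block([[F, ones], [-F, ones]])
    b = np.concatenate([0.5 * step + y_hat, 0.5 * step - y_hat])
    c = np.concatenate([np.zeros(N), [-1.0]])  # minimizing -delta maximizes slack
    result = linprog(c, A_ub=A, b_ub=b, bounds=[(None, None)] * (N + 1))
    return result.x[:N]

rng = np.random.default_rng(1)  # arbitrary seed
F = rng.standard_normal((8, 2))
x = rng.uniform(-1, 1, size=2)
step = 0.25
y_hat = step * np.round(F @ x / step)  # uniform scalar quantization of y = F x
x_hat = consistent_reconstruction(F, step, y_hat)
print(x, x_hat)
```

The program is always feasible, since the true x with zero slack satisfies (14), and the slack is bounded above by half the step size, so a finite optimum exists.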

The cost of Algorithm 1 may be prohibitive if M is large. In particular, Algorithm 1 uses a linear program 
with N + 1 variables and 2M constraints, and solving this has cost that is super linear in M. One way to 
reduce the cost to linear in M is to use each of M quantized coefficients only once and in a computation with 
constant cost. Algorithm 2 uses each constraint (13) once, recursively, in isolation of all other constraints. 
It uses yk by orthogonally projecting a running estimate Xk-i to the set consistent with (13). Remarkably, 
even though the final estimate is generally not consistent with all M constraints of the form (13), the optimal 
9(M~^) decay of ||a; — as a function of the number of coefficients M can be attained under appropriate 
technical conditions [8, 12]. 
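The recursion of Algorithm 2 can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `recursive_estimate` and `demo`, the random unit-norm frame, and the rounding quantizer used to generate test data are all assumptions made for the example.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def recursive_estimate(frame, y_hat, step):
    """Sketch of Algorithm 2: one orthogonal projection per coefficient.

    frame : list of M analysis vectors (each a list of N floats)
    y_hat : list of M quantized coefficients
    step  : quantization step size (Delta)
    """
    x_hat = [0.0] * len(frame[0])            # step 1: start from the zero vector
    for phi, yk in zip(frame, y_hat):        # step 2, for k = 1, ..., M
        inner = dot(x_hat, phi)
        if inner < yk - step / 2:
            target = yk - step / 2           # move up to the lower face of the slab
        elif inner > yk + step / 2:
            target = yk + step / 2           # move down to the upper face of the slab
        else:
            continue                         # already consistent with y_k
        scale = (target - inner) / dot(phi, phi)
        x_hat = [a + scale * p for a, p in zip(x_hat, phi)]
    return x_hat


def demo(n=3, m=2000, step=0.1, seed=0):
    """Hypothetical harness: random unit-norm frame, uniform scalar quantizer."""
    rng = random.Random(seed)
    x = [rng.uniform(-0.5, 0.5) for _ in range(n)]
    frame = []
    for _ in range(m):
        g = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(dot(g, g))
        frame.append([a / norm for a in g])
    y_hat = [step * round(dot(x, phi) / step) for phi in frame]
    x_hat = recursive_estimate(frame, y_hat, step)
    err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_hat)))
    return x, x_hat, err
```

Each update touches one coefficient and costs $O(N)$ operations, so the total cost is linear in $M$, which is the property the text relies on.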



4. Frame Permutation Quantization 

With background material on permutation source codes and finite frames in place, we are now prepared 
to formally introduce frame permutation quantization. FPQ is simply PSC applied to a frame expansion. 

4.1. Encoder Definition and Canonical Decoding

Definition 4.1. A frame permutation quantizer with analysis frame $F \in \mathbb{R}^{M\times N}$, composition $m = (m_1, m_2, \ldots, m_K)$,
and initial codeword $\hat{y}_{\rm init}$ compatible with $m$ encodes $x \in \mathbb{R}^N$ by applying a permutation source code with
composition $m$ and initial codeword $\hat{y}_{\rm init}$ to $Fx$. The canonical decoding gives $\hat{x} = F^{\dagger}\hat{y}$, where $\hat{y}$ is the PSC
reconstruction of $y$.



The two variants of PSCs yield two variants of FPQ. We sometimes use the triple $(F, m, \hat{y}_{\rm init})$ along with a
specification of Variant I or Variant II to refer to such an FPQ.

For Variant I, the result of the encoding can be expressed as a permutation $P$ from the permutation
matrices of size $M$. The permutation is such that $PFx$ is $m$-descending. For uniqueness in the representation
$P$ chosen from the set of permutation matrices, we can specify that the first $m_1$ components of $Py$ are kept
in the same order as they appeared in $y$, the next $m_2$ components of $Py$ are kept in the same order as they
appeared in $y$, etc. Then $P$ is in a subset $\mathcal{G}(m)$ of the $M \times M$ permutation matrices and

$$|\mathcal{G}(m)| = \frac{M!}{m_1!\, m_2! \cdots m_K!}. \tag{15}$$

Notice that, analogous to the discussion in Section 3.2.2, the encoding uses the composition $m$ but not the
initial codeword $\hat{y}_{\rm init}$. The PSC reconstruction of $y$ is $P^{-1}\hat{y}_{\rm init}$, so the canonical decoding gives $F^{\dagger}P^{-1}\hat{y}_{\rm init}$.
For Variant II, we will sidestep the differences between the $\mu_K = 0$ and $\mu_K > 0$ cases in Section 3.2 by
specifying that the signs of the $m_K$ smallest-magnitude components of $Fx$ are not encoded and $m_K = 0$ is
allowed. Now the result of encoding can be expressed similarly as $P \in \mathcal{G}(m)$ along with a diagonal matrix
$V$ with $\pm 1$ on its diagonal. These matrices are selected such that the elementwise absolute values of $VPFx$
are $m$-descending and also the first $M - m_K$ entries of $VPFx$ are positive. The last $m_K$ diagonal entries of
$V$ do not affect the encoding and are set to $+1$. Thus $V$ is in a subset $\mathcal{Q}(m)$ of the $M \times M$ sign-changing
matrices and

$$|\mathcal{Q}(m)| = 2^{M - m_K}. \tag{16}$$

The PSC reconstruction of $y$ is $P^{-1}V^{-1}\hat{y}_{\rm init}$, so the canonical decoding gives $F^{\dagger}P^{-1}V^{-1}\hat{y}_{\rm init}$.
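As an illustration of the Variant I encoder and the PSC reconstruction step, the following Python sketch represents the permutation $P$ as an index list rather than a matrix. The names `fpq_encode_variant1` and `psc_reconstruct` are hypothetical, and the final multiplication by $F^{\dagger}$ in the canonical decoding is deliberately left out.

```python
def fpq_encode_variant1(frame, x, m):
    """Return perm such that [y[i] for i in perm] is m-descending, y = F x.

    frame : list of the M rows of the analysis operator F (each of length N)
    x     : source vector of length N
    m     : composition (m_1, ..., m_K) with m_1 + ... + m_K = M
    """
    y = [sum(f * xi for f, xi in zip(row, x)) for row in frame]
    if sum(m) != len(y):
        raise ValueError("composition must sum to the frame size M")
    order = sorted(range(len(y)), key=lambda i: -y[i])   # stable, descending
    perm, start = [], 0
    for size in m:
        # inside a group the original order is kept, which makes P unique
        perm.extend(sorted(order[start:start + size]))
        start += size
    return perm


def psc_reconstruct(perm, y_init):
    """PSC reconstruction P^{-1} y_init: put codeword entries back in place."""
    y_hat = [0.0] * len(perm)
    for position, index in enumerate(perm):
        y_hat[index] = y_init[position]
    return y_hat


# Small example with an assumed 3 x 2 frame (rows are the frame vectors).
example_frame = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
example_perm = fpq_encode_variant1(example_frame, [0.3, -0.2], (1, 1, 1))
example_y_hat = psc_reconstruct(example_perm, [1.0, 0.0, -1.0])
```

Only the sort is needed at the encoder, which is why the encoding cost is dominated by ordering the $M$ frame coefficients.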

The sizes of the sets $\mathcal{G}(m)$ and $\mathcal{G}(m) \times \mathcal{Q}(m)$ are analogous to the codebook sizes in (2), and the per-component
rates of FPQ are thus defined as

$$R = N^{-1} \log_2 \frac{M!}{m_1!\, m_2! \cdots m_K!}, \quad \text{for Variant I,} \tag{17a}$$

and

$$R = N^{-1} \left( M - m_K + \log_2 \frac{M!}{m_1!\, m_2! \cdots m_K!} \right), \quad \text{for Variant II.} \tag{17b}$$
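The rates (17a) and (17b) are easy to evaluate numerically. The sketch below is an assumed helper (the name `fpq_rate` is not from the paper) that returns the per-component rate in bits for a given composition and source dimension.

```python
import math


def log2_multinomial(m):
    """log2 of M! / (m_1! m_2! ... m_K!) for a composition m of M."""
    count = math.factorial(sum(m))
    for part in m:
        count //= math.factorial(part)   # every intermediate quotient is an integer
    return math.log2(count)


def fpq_rate(m, n, variant=1):
    """Per-component rate (17a) for variant=1 or (17b) for variant=2."""
    bits = log2_multinomial(m)
    if variant == 2:
        bits += sum(m) - m[-1]           # one sign bit for all but the last m_K entries
    return bits / n


rate_i = fpq_rate((2, 2), n=2)                 # Variant I, (N, M) = (2, 4)
rate_ii = fpq_rate((1, 1, 1), n=3, variant=2)  # Variant II, (N, M) = (3, 3)
```

For a fixed $N$, increasing $M$ enlarges the set of compositions and therefore the set of achievable rates, and the all-1s composition gives the maximum rate for a given $M$.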



4.2. Expressing Consistency Constraints

Suppose FPQ encoding of $x \in \mathbb{R}^N$ with frame $F \in \mathbb{R}^{M\times N}$, composition $m = (m_1, m_2, \ldots, m_K)$, and
initial codeword $\hat{y}_{\rm init}$ compatible with $m$ results in permutation $P \in \mathcal{G}(m)$ (and, in the case of Variant II,
$V \in \mathcal{Q}(m)$) as described in Section 4.1. We would like to express constraints on $x$ that are specified by
$(F, m, \hat{y}_{\rm init}, P)$ (or $(F, m, \hat{y}_{\rm init}, P, V)$). This will provide an explanation of the partitions induced by FPQ
and lead to reconstruction algorithms in Section 4.3.

Knowing that a vector is $m$-descending is a specification of many inequalities. Recall the definitions of
the index sets generated by a composition given in (5), and use the same notation with $n_k$'s replaced by
$m_k$'s. Then $z$ being $m$-descending implies that for any $i < j$,

$$z_k \ge z_\ell \quad \text{for every } k \in \mathcal{I}_i \text{ and } \ell \in \mathcal{I}_j.$$

By transitivity, considering every $i < j$ gives redundant inequalities. Taking only $j = i + 1$, we obtain a full
description

$$z_k \ge z_\ell \quad \text{for every } k \in \mathcal{I}_i \text{ and } \ell \in \mathcal{I}_{i+1} \text{ with } i = 1, 2, \ldots, K-1. \tag{18}$$

For one fixed $(i, \ell)$ pair, (18) gives $|\mathcal{I}_i| = m_i$ inequalities, one for each $k \in \mathcal{I}_i$. These inequalities can be
gathered into an elementwise matrix inequality as



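$$\begin{bmatrix} z_{M_{i-1}+1} \\ z_{M_{i-1}+2} \\ \vdots \\ z_{M_i} \end{bmatrix} \ge \begin{bmatrix} z_\ell \\ z_\ell \\ \vdots \\ z_\ell \end{bmatrix},$$

where $M_k = m_1 + m_2 + \cdots + m_k$, or $D^{(m)}_{i,\ell}\, z \ge 0_{m_i \times 1}$ where

$$D^{(m)}_{i,\ell} = \begin{bmatrix} 0_{m_i \times M_{i-1}} & I_{m_i} & 0_{m_i \times (\ell - M_i - 1)} & -1_{m_i \times 1} & 0_{m_i \times (M - \ell)} \end{bmatrix} \tag{19a}$$

is an $m_i \times M$ differencing matrix. Allowing $\ell$ to vary across $\mathcal{I}_{i+1}$, we define the $m_i m_{i+1} \times M$ matrix

$$D^{(m)}_{i} = \begin{bmatrix} D^{(m)}_{i,M_i+1} \\ D^{(m)}_{i,M_i+2} \\ \vdots \\ D^{(m)}_{i,M_i+m_{i+1}} \end{bmatrix} \tag{19b}$$

and express all of (18) for one fixed $i$ as $D^{(m)}_i z \ge 0_{m_i m_{i+1} \times 1}$.

Continuing our recursion, it only remains to gather the inequalities (18) across $i \in \{1, 2, \ldots, K-1\}$. Let

$$D^{(m)} = \begin{bmatrix} D^{(m)}_{1} \\ D^{(m)}_{2} \\ \vdots \\ D^{(m)}_{K-1} \end{bmatrix}, \tag{19c}$$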



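which has

$$L(m) = \sum_{i=1}^{K-1} m_i m_{i+1} \tag{20}$$

rows. The property of $z$ being $m$-descending can be expressed as $D^{(m)} z \ge 0_{L(m)\times 1}$. The following example
illustrates the form of $D^{(m)}$:

$$D^{((2,3,2))} = \begin{bmatrix}
1 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 1 & 0 & -1 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & -1 & 0 & 0 \\
0 & 1 & 0 & 0 & -1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & -1 & 0 \\
0 & 0 & 0 & 1 & 0 & -1 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & -1 \\
0 & 0 & 0 & 1 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 & 1 & 0 & -1
\end{bmatrix}.$$

Notice the important property that each row of $D^{(m)}$ has one 1 entry and one $-1$ entry with the remaining
entries 0. This will be exploited in Section 5.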
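The construction in (19a) through (19c) can be sketched directly in Python. The function name `differencing_matrix` is an assumption made for this example, and indices are 0-based in the code while the text uses 1-based indices.

```python
def differencing_matrix(m):
    """Rows of D^(m) for a composition m, following (19a)-(19c).

    Rows are ordered by group i, then by ell in group i+1, then by k in group i.
    """
    bounds = [0]
    for part in m:
        bounds.append(bounds[-1] + part)               # bounds[i] equals M_i
    size = bounds[-1]
    rows = []
    for i in range(len(m) - 1):
        for ell in range(bounds[i + 1], bounds[i + 2]):    # ell ranges over group i+1
            for k in range(bounds[i], bounds[i + 1]):      # k ranges over group i
                row = [0] * size
                row[k], row[ell] = 1, -1                   # encodes z_k - z_ell >= 0
                rows.append(row)
    return rows


d_232 = differencing_matrix((2, 3, 2))
row_count = len(d_232)   # equals L(m) = 2*3 + 3*2
```

Because every row sums to zero, multiplying $D^{(m)}$ by any matrix whose columns are constant gives the zero matrix, which is the property used in Section 5.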

Now we can apply these representations to FPQ. 

Variant I: In this case, we know $PFx$ is $m$-descending. Consistency is thus simply expressed as

$$D^{(m)}PFx \ge 0. \tag{21}$$

Variant II: The second variant has an $m$-descending property after $V$ has made the signs of the significant
frame coefficients (all but the last $m_K$) positive: $D^{(m)}VPFx \ge 0$. In addition, we have the nonnegativity of all
of the first $M - m_K$ sorted and sign-changed coefficients. To specify

$$\begin{bmatrix} I_{M-m_K} & 0_{(M-m_K)\times m_K} \end{bmatrix} VPFx \ge 0_{(M-m_K)\times 1}$$

is redundant with what is expressed with the $m$-descending property. The added constraints can be applied
only to the entries of $VPFx$ with indexes in $\mathcal{I}_{K-1}$ because all the earlier entries are already ensured to be
larger. We thus express consistency as

$$\begin{bmatrix} D^{(m)} \\ \begin{matrix} 0_{m_{K-1}\times M_{K-2}} & I_{m_{K-1}} & 0_{m_{K-1}\times m_K} \end{matrix} \end{bmatrix} VPFx \ge 0. \tag{22}$$

Algorithm 3 Estimation of Source on $[-\tfrac{1}{2}, \tfrac{1}{2}]^N$ for Variant I Frame Permutation Quantization

Inputs: Analysis frame operator $F$, composition $m$, and FPQ encoding $P$
Output: Estimate $\hat{x}$ consistent with $(F, m, P)$ and as far from the partition boundaries as possible

1. Let $A = \begin{bmatrix} -D^{(m)}PF & 1_{L(m)\times 1} \\ I_N & 1_{N\times 1} \\ -I_N & 1_{N\times 1} \end{bmatrix}$ and $b = \begin{bmatrix} 0_{L(m)\times 1} \\ \tfrac{1}{2}\, 1_{2N\times 1} \end{bmatrix}$,
   where $D^{(m)}$ is defined in (19) and $L(m)$ is defined in (20).
   (Consistency with (21) and $x \in [-\tfrac{1}{2}, \tfrac{1}{2}]^N$ is expressed as $A \begin{bmatrix} x \\ 0 \end{bmatrix} \le b$.)

2. Let $c = \begin{bmatrix} 0_{N\times 1} \\ -1 \end{bmatrix}$.

3. Use a linear programming method to minimize $c^T \begin{bmatrix} x \\ \delta \end{bmatrix}$ subject to $A \begin{bmatrix} x \\ \delta \end{bmatrix} \le b$.
   Return the first $N$ components of the minimizer as $\hat{x}$.
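The Variant I consistency condition (21) can be checked without forming $D^{(m)}$ explicitly, because every row of $D^{(m)}$ compares an entry of one group with an entry of the next group. The following Python sketch computes the smallest entry of $D^{(m)}PF\hat{x}$, that is, the quantity the slack variable $\delta$ bounds in the partition constraints. The function names are hypothetical, the permutation is given as an index list, and the composition is assumed to have at least two parts.

```python
def min_slack_variant1(frame, x_hat, m, perm):
    """Smallest entry of D^(m) P F x_hat, computed group by group.

    frame : list of the M rows of F;  x_hat : candidate estimate
    m     : composition with K >= 2 parts
    perm  : index list such that (P y)[r] = y[perm[r]]
    """
    y = [sum(f * xi for f, xi in zip(row, x_hat)) for row in frame]
    sorted_y = [y[i] for i in perm]               # entries of P F x_hat
    groups, start = [], 0
    for size in m:
        groups.append(sorted_y[start:start + size])
        start += size
    # the tightest inequality between adjacent groups is min(upper) - max(lower)
    return min(min(a) - max(b) for a, b in zip(groups, groups[1:]))


def is_consistent_variant1(frame, x_hat, m, perm):
    """True when x_hat satisfies (21) for the encoding perm."""
    return min_slack_variant1(frame, x_hat, m, perm) >= 0


check_frame = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]   # assumed example frame
slack_good = min_slack_variant1(check_frame, [0.3, -0.2], (1, 2), [0, 1, 2])
slack_bad = min_slack_variant1(check_frame, [-1.0, 0.0], (1, 2), [0, 1, 2])
```

A positive value means the candidate lies strictly inside the partition cell, and scaling the candidate by a positive constant scales the slack by the same constant.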

4.3. Consistent Reconstruction Algorithms

The constraints (21) and (22) both specify unbounded sets, as discussed previously and illustrated in
Fig. 1(e) and (f). To be able to decode FPQs in analogy to Algorithm 1, we require some additional
constraints. We develop two examples: a source $x$ bounded to $[-\tfrac{1}{2}, \tfrac{1}{2}]^N$ (e.g., having an i.i.d. uniform
distribution over $[-\tfrac{1}{2}, \tfrac{1}{2}]$) or having an i.i.d. standard Gaussian distribution. For the remainder of this
section, we consider only Variant I because adjusting for Variant II using (22) is easy.

4.3.1. Source Bounded to $[-\tfrac{1}{2}, \tfrac{1}{2}]^N$

To impose (21) along with $x \in [-\tfrac{1}{2}, \tfrac{1}{2}]^N$ is trivial because $x \in [-\tfrac{1}{2}, \tfrac{1}{2}]^N$ is decomposable into $2N$
inequality constraints:

$$\begin{bmatrix} I_N \\ -I_N \end{bmatrix} x \le \tfrac{1}{2}\, 1_{2N\times 1}.$$

A linear programming formulation will return some corner of the consistent set, depending on the choice of
cost function. The unknown vector $x$ can be augmented with a variable $\delta$ that represents the slackness of
the inequality constraint with the least slack. Maximizing $\delta$ moves the solution away from the boundary
of the consistent set (partition cell) as much as possible. Reconstruction using this principle is outlined in
Algorithm 3.

If the source $x$ is random and the distribution $p(x)$ is known, then one could optimize some criterion
explicitly. For example, one could maximize $p(x)$ over the consistent set or compute the centroid of the
consistent set with respect to $p(x)$. This would improve upon reconstructions computed with Algorithm 3
but presumably increase complexity greatly.

4.3.2. Source with i.i.d. Standard Gaussian Distribution

Suppose $x$ has i.i.d. Gaussian components with mean zero and unit variance. Since the source support is
unbounded, something beyond consistency must be used in reconstruction. Here we use a quadratic program
to find a good bounded, consistent estimate and combine this with the average value of $\|x\|$.



Algorithm 4 Estimation of $\mathcal{N}(0, I_N)$ Source for Variant I Frame Permutation Quantization

Inputs: Analysis frame operator $F$, composition $m$, and FPQ encoding $P$
Output: Estimate $\hat{x}$ consistent with $(F, m, P)$ and as far from the partition boundaries as possible
while keeping $\|\hat{x}\| = E[\|x\|]$

1. Let $A = \begin{bmatrix} -D^{(m)}PF & 1_{L(m)\times 1} \end{bmatrix}$ and $b = 0_{L(m)\times 1}$,
   where $D^{(m)}$ is defined in (19) and $L(m)$ is defined in (20).
   (Consistency with (21) is expressed as $A \begin{bmatrix} x \\ 0 \end{bmatrix} \le b$.)

2. Let $c = \begin{bmatrix} 0_{N\times 1} \\ -1 \end{bmatrix}$.

3. Use a quadratic programming method to minimize $\tfrac{1}{2} \begin{bmatrix} x \\ \delta \end{bmatrix}^T \begin{bmatrix} I_N & 0_{N\times 1} \\ 0_{1\times N} & 0 \end{bmatrix} \begin{bmatrix} x \\ \delta \end{bmatrix} + c^T \begin{bmatrix} x \\ \delta \end{bmatrix}$
   subject to $A \begin{bmatrix} x \\ \delta \end{bmatrix} \le b$. Denote the first $N$ components of the minimizer as $\hat{x}_{\rm ang}$.

4. Return $\left(\sqrt{2\pi}/\beta(N/2, 1/2)\right) \hat{x}_{\rm ang}/\|\hat{x}_{\rm ang}\|$.



The problem with using (21) combined with maximization of minimum slackness alone (without any
additional boundedness constraints) is that for any purported solution, multiplying by a scalar larger than
1 will increase the slackness of all the constraints. Thus, any solution technique will naturally and correctly
have $\|\hat{x}\| \to \infty$. Actually, because the partition cells are convex cones, we should not hope to recover the
radial component of $x$ from the partition. Instead, we should only hope to recover a good estimate of $x/\|x\|$.

To estimate the angular component $x/\|x\|$ from the partition, it would be convenient to maximize minimum
slackness while also imposing a constraint of $\|\hat{x}\| = 1$. Unfortunately, this is a nonconvex constraint.
It can be replaced by $\|\hat{x}\| \le 1$ because slackness is proportional to $\|\hat{x}\|$. This suggests the optimization

$$\text{maximize } \delta \quad \text{subject to } \|x\| \le 1 \text{ and } D^{(m)}PFx \ge \delta\, 1_{L(m)\times 1}.$$

Denoting the $x$ at the optimum by $\hat{x}_{\rm ang}$, we still need to choose the radial component, or length, of $\hat{x}$.
For the $\mathcal{N}(0, I_N)$ source, the mean length is [42]

$$E[\|x\|] = \frac{\sqrt{2\pi}}{\beta(N/2, 1/2)}.$$

We can combine this with $\hat{x}_{\rm ang}$ to obtain a reconstruction $\hat{x}$.

We use a slightly different formulation to have a quadratic program in standard form. We combine the
radial component constraint with the goal of maximizing slackness to obtain

$$\text{minimize } \tfrac{1}{2} x^T x - \lambda\delta \quad \text{subject to } -D^{(m)}PFx \le -\delta\, 1_{L(m)\times 1},$$

where $\lambda$ trades off slackness against the radial component of $x$. Since the radial component will be replaced
with its expectation, the choice of $\lambda$ is immaterial; it is set to 1 in Algorithm 4.
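The radial step can be illustrated numerically. The sketch below evaluates the mean length $\sqrt{2\pi}/\beta(N/2, 1/2)$ of an $\mathcal{N}(0, I_N)$ vector through the gamma function and rescales an angular estimate to that length. The names `mean_gaussian_norm` and `rescale` are assumptions for this example, and the angular estimate passed in stands in for the output of a quadratic program that is not shown.

```python
import math


def mean_gaussian_norm(n):
    """E||x|| for x ~ N(0, I_n), written as sqrt(2*pi) / beta(n/2, 1/2)."""
    beta = math.gamma(n / 2) * math.gamma(0.5) / math.gamma((n + 1) / 2)
    return math.sqrt(2 * math.pi) / beta


def rescale(x_ang, n):
    """Keep the direction of x_ang and set its length to E||x||."""
    norm = math.sqrt(sum(a * a for a in x_ang))
    return [mean_gaussian_norm(n) * a / norm for a in x_ang]


# Placeholder angular estimate (hypothetical values) for N = 4.
x_rec = rescale([0.2, -0.1, 0.05, 0.3], 4)
```

For $N = 1$ the formula reduces to $\sqrt{2/\pi}$, the mean absolute value of a standard Gaussian, and for $N = 2$ it reduces to $\sqrt{\pi/2}$, the mean of a Rayleigh variable.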

4.4. Recursive Estimation Algorithms

Algorithms 3 and 4 are practical for small values of $N$ and $M$, as one would encounter in data compression
applications, but not for large values of $N$ and $M$, as may be of interest in data acquisition. Specifically,
Algorithm 3 uses a linear program with $N + 1$ variables and $L(m) + 2N$ constraints, while Algorithm 4
uses a quadratic program with $N + 1$ variables and $L(m)$ constraints. The costs of these computations are
superlinear in $L(m)$, and $L(m)$ is at least $M - 1$.



Algorithm 5 Recursive Estimation for Variant I Frame Permutation Quantization 



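Inputs: Analysis frame sequence $\{\varphi_k\}_{k=1}^{M}$, FPQ representation $t_{j,i} = \operatorname{sign}(\langle x, \varphi_j - \varphi_i\rangle)$ for $j = 1, 2, \ldots, M$,
$i = 1, 2, \ldots, j - 1$, and index sets $\mathcal{J}_2, \mathcal{J}_3, \ldots, \mathcal{J}_M$ with $\mathcal{J}_k \subset \{1, 2, \ldots, k - 1\}$
Output: Estimate $\hat{x}_M$

1. Let $\hat{x}_1$ be a vector chosen uniformly at random from the unit sphere in $\mathbb{R}^N$ and let $k = 2$.

2. Let $\hat{x}_k = \hat{x}_{k-1}$.

3. For each $j \in \mathcal{J}_k$, taken in random order:

3a. Let $\psi_{k,j} = \varphi_k - \varphi_j$.

3b. If $\operatorname{sign}(\langle \hat{x}_k, \psi_{k,j}\rangle) \ne t_{k,j}$, update $\hat{x}_k$ to $\hat{x}_k - \langle \hat{x}_k, \psi_{k,j}\rangle \psi_{k,j}/\|\psi_{k,j}\|^2$.

4. If $k = M$, return $\hat{x}_M$; else increment $k$ and go to step 2.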



One way to lower the reconstruction complexity is to sacrifice global consistency in favor of recursive
computability, in analogy to Algorithm 2. For recursive algorithms, we restrict our attention to the
composition $m = (1, 1, \ldots, 1)$. We also again restrict our attention to Variant I because adjusting for Variant II
is straightforward.

With all-1s compositions, FPQ encoding produces a successive or embedded representation of $x$: a
representation with a $k$-element analysis frame is a ranking of $\{\langle x, \varphi_i\rangle\}_{i=1}^{k}$ (for Variant I), and adding
a vector $\varphi_{k+1}$ to the analysis frame amounts to inserting $\langle x, \varphi_{k+1}\rangle$ in the ranked list. Equivalently, the
encoding of $x$ with a $k$-element frame is the set

$$\operatorname{sign}(\langle x, \varphi_i\rangle - \langle x, \varphi_j\rangle), \quad i, j \in \{1, 2, \ldots, k\},$$

and adding $\varphi_{k+1}$ to the analysis frame adds

$$\operatorname{sign}(\langle x, \varphi_{k+1}\rangle - \langle x, \varphi_j\rangle), \quad j \in \{1, 2, \ldots, k\},$$

to the representation without removing any of the previous information.

In estimating $x$ from FPQ with the full $(k + 1)$-element analysis frame, one could impose

$$\operatorname{sign}(\langle \hat{x}, \varphi_i - \varphi_j\rangle) = \operatorname{sign}(\langle x, \varphi_i - \varphi_j\rangle), \quad i, j \in \{1, 2, \ldots, k + 1\} \tag{23}$$

(equivalent to (21)) for global consistency. However, for a recursive computation we will compute an estimate
$\hat{x}_{k+1}$ from an estimate $\hat{x}_k$ and some subset of the constraints

$$\operatorname{sign}(\langle \hat{x}, \varphi_{k+1} - \varphi_j\rangle) = \operatorname{sign}(\langle x, \varphi_{k+1} - \varphi_j\rangle), \quad j \in \{1, 2, \ldots, k\}. \tag{24}$$

Updating $\hat{x}_k$ to satisfy any of the constraints (24) can cause constraints (23) with $i < k + 1$ to be violated, so
a strategy of imposing local consistency does not ensure global consistency. However, we will demonstrate
by extending results from [8, 12] that optimal MSE decay as a function of $M$ can still be obtained.

A recursive computation that uses local consistency is described explicitly in Algorithm 5. For each $k$,
the set $\mathcal{J}_k$ represents the indexes $j$ for which the constraint (24) is used. Any one of them is used (in Step
3b) by orthogonally projecting the running estimate $\hat{x}_{k-1}$ to the half-space consistent with (24). This yields
a monotonicity result analogous to [8, Thm. 1] and [12, Lem. 7.2]:

Theorem 4.2. Let $x \in \mathbb{R}^N$ be a unit vector. The sequence of estimates produced in Algorithm 5 satisfies

$$\|x - \hat{x}_{k+1}\| \le \|x - \hat{x}_k\|.$$

Proof. Since Step 3b is an orthogonal projection to a convex set containing $x$, no occurrence of this step
can increase the estimation error.
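A minimal Python sketch of Algorithm 5 is given below, mainly to make the projection in Step 3b and the monotonicity of Theorem 4.2 concrete. The function names, the dictionary layout of the sign data, and the use of 0-based indices are assumptions of this sketch rather than part of the algorithm as published.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sign(t):
    return (t > 0) - (t < 0)


def random_unit_vector(n, rng):
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(dot(g, g))
    return [a / norm for a in g]


def fpq_signs(x, frame):
    """All-1s Variant I representation: signs[(k, j)] = sign(<x, phi_k - phi_j>), j < k."""
    return {(k, j): sign(dot(x, [a - b for a, b in zip(frame[k], frame[j])]))
            for k in range(1, len(frame)) for j in range(k)}


def fpq_recursive_estimate(frame, signs, index_sets, rng):
    """Sketch of Algorithm 5; returns the whole sequence of estimates.

    index_sets[k] lists the indices j < k whose constraints are used at step k.
    """
    x_hat = random_unit_vector(len(frame[0]), rng)       # step 1
    history = [x_hat]
    for k in range(1, len(frame)):
        js = list(index_sets[k])
        rng.shuffle(js)                                   # step 3: random order
        for j in js:
            psi = [a - b for a, b in zip(frame[k], frame[j])]     # step 3a
            inner = dot(x_hat, psi)
            if sign(inner) != signs[(k, j)]:                      # step 3b
                scale = inner / dot(psi, psi)
                x_hat = [a - scale * p for a, p in zip(x_hat, psi)]
        history.append(x_hat)
    return history


def run_singleton_demo(n=4, m=300, seed=1):
    """Hypothetical experiment with singleton index sets and a random frame."""
    rng = random.Random(seed)
    x = random_unit_vector(n, rng)
    frame = [random_unit_vector(n, rng) for _ in range(m)]
    index_sets = {k: [rng.randrange(k)] for k in range(1, m)}
    history = fpq_recursive_estimate(frame, fpq_signs(x, frame), index_sets, rng)
    return [math.sqrt(sum((a - b) ** 2 for a, b in zip(x, h))) for h in history]
```

The list returned by `run_singleton_demo` holds $\|x - \hat{x}_k\|$ for each $k$ and should be nonincreasing up to floating-point rounding, in line with Theorem 4.2.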



The number of projection steps in Algorithm 5 depends on the sizes of the $\mathcal{J}_k$'s. At one extreme, each
$\mathcal{J}_k$ is a singleton so there are $M - 1$ projections. At the other extreme, each $\mathcal{J}_k$ has $k - 1$ elements and
there are $\tfrac{1}{2}M(M - 1)$ projections. The empirical behavior in Section 6.3 shows that the MSE decays inversely
with the square of the number of projection steps. This behavior is provable in some cases. One such result is the
following theorem, analogous to [8, Thm. 2] and presumably extendable to match [12, Thm. 7.9]:

Theorem 4.3. Let $x \in \mathbb{R}^N$ be a unit vector, let $\{\varphi_k\}_{k=1}^{\infty}$ be an i.i.d. sequence of vectors drawn from the
uniform distribution on the unit sphere in $\mathbb{R}^N$, and let each $\mathcal{J}_k$ be a singleton subset of $\{1, 2, \ldots, k - 1\}$ for
$k \in \{2, 3, \ldots\}$. Then the normalized sequence of estimates produced by Algorithm 5 satisfies

$$\|x - \hat{x}_k\|^2\, k^p \to 0 \quad \text{almost surely, for every } p < 2. \tag{25}$$

Proof. We give only a brief sketch of a proof since the main ideas have been developed by Rangan and 
Goyal [8] and Powell [12]. 

According to [8, Thm. 2] , Algorithm 2 gives performance satisfying (25) under the following assumptions: 

1. Quantization noise is on a known, bounded interval; 

2. the frame sequence is i.i.d. and independent of the quantization noise; and 

3. the frame vectors are bounded and satisfy 

$$E[|\langle z, \varphi_k\rangle|] \ge \epsilon \|z\| \quad \text{for all } z \in \mathbb{R}^N$$

for some $\epsilon > 0$.

Assumption 1 can be loosened to quantization noise known to lie in $[-1, \infty)$ or $(-\infty, 1]$ without changing
the conclusion that (25) holds; it is qualitatively like discarding half of the quantized frame coefficients since
each frame coefficient gives one half-space constraint rather than two. Assumption 3 is a very mild condition
that simply ensures that there is no nonzero vector $z \in \mathbb{R}^N$ such that all of the probability mass of the frame
vector distribution is orthogonal to $z$; such orthogonality would obviously make it impossible to recover $z$
from the frame coefficient sequence.

An FPQ representation is through $\operatorname{sign}(\langle x, \psi_{k,j}\rangle)$ where $\psi_{k,j} = \varphi_k - \varphi_j$. This is equivalent to a quantized
frame expansion with analysis vectors $\psi_{k,j}$ and quantization noise bounded to $[-1, \infty)$ when the signum
function returns 1 and to $(-\infty, 1]$ when the signum function returns $-1$. We are considering reconstruction
where each $\mathcal{J}_k$ is a singleton, so write $\mathcal{J}_k = \{j_k\}$. Now the i.i.d. uniform distribution on $\{\varphi_k\}_{k=1}^{\infty}$ makes
$\{\psi_{k,j_k}\}_{k=2}^{\infty}$ a sequence that satisfies Assumptions 2 and 3. By eliminating the radial component of $x$, one
can now show that the error decay (25) holds by using [8, Thm. 2].

Extensions of [8, Thm. 2] and [12, Thm. 7.9] to non-i.i.d. frame sequences would lead to extensions of
Theorem 4.3 beyond singleton $\mathcal{J}_k$'s.

5. Conditions on the Choice of Frame 

In this section, we provide necessary and sufficient conditions so that a linear reconstruction is also
consistent. We first consider a general linear reconstruction, $\hat{x} = R\hat{y}$, where $R$ is some $N \times M$ matrix and
$\hat{y}$ is a decoding of the PSC of $y$. We then restrict attention to canonical reconstruction, where $R = F^{\dagger}$. For
each case, we describe all possible choices of a "good" frame $F$, in the sense of the consistency of the linear
reconstruction.

5.1. Arbitrary Linear Reconstruction 

We begin by introducing a useful term. 

Definition 5.1. A matrix is called column-constant when each column of the matrix is a constant. The set
of all $M \times M$ column-constant matrices is denoted $\mathcal{J}$.

We now give our main results for arbitrary linear reconstruction combined with FPQ decoding of an
estimate of $y$.

Theorem 5.2. Suppose $A = FR = aI_M + J$ for some $a \ge 0$ and $J \in \mathcal{J}$. Then the linear reconstruction
$\hat{x} = R\hat{y}$ is consistent with Variant I FPQ encoding using frame $F$, an arbitrary composition, and an arbitrary
Variant I initial codeword compatible with it.

Proof. We start the proof by pointing out two special properties of any matrix $J \in \mathcal{J}$:
(P1) $PJP^{-1} \in \mathcal{J}$ for any permutation matrix $P$; and
(P2) $D^{(m)}J = 0_{L(m)\times M}$ for any composition $m$.

(P1) follows from the fact that neither left multiplying by $P$ nor right multiplying by $P^{-1}$ disturbs
column-constancy. (P2) is true because each row of $D^{(m)}$ has zero entries except for one 1 and one $-1$.

Suppose $m = (m_1, m_2, \ldots, m_K)$ is an arbitrary composition of $M$ and $\hat{y}_{\rm init}$ is an arbitrary Variant I
initial codeword compatible with $m$. Let $P$ be the Variant I FPQ encoding of $x$ using $(F, m, \hat{y}_{\rm init})$. We
would like to check that $\hat{x} = R\hat{y}$ is consistent with the encoding $P$. This is verified through the following
computation:

$$\begin{aligned}
D^{(m)}PF\hat{x} &= D^{(m)}PFR\hat{y} \\
&= D^{(m)}PFRP^{-1}\hat{y}_{\rm init} && (26) \\
&= D^{(m)}P(aI_M + J)P^{-1}\hat{y}_{\rm init} && (27) \\
&= aD^{(m)}\hat{y}_{\rm init} + D^{(m)}\tilde{J}\hat{y}_{\rm init} \quad \text{for some } \tilde{J} \in \mathcal{J} && (28) \\
&= aD^{(m)}\hat{y}_{\rm init} && (29) \\
&\ge 0_{L(m)\times 1}, && (30)
\end{aligned}$$

where (26) uses the conventional decoding of a PSC; (27) follows from the hypothesis of the theorem on
$A$; (28) follows from (P1); (29) follows from (P2); and (30) follows from the definition of Variant I initial
codewords compatible with $m$, and the nonnegativity of $a$. This completes the proof.

The key point of the proof of Theorem 5.2 is showing that the inequality

$$D^{(m)}PAP^{-1}\hat{y}_{\rm init} \ge 0, \tag{31}$$

where $A = FR$, holds for every composition $m$ and every initial codeword $\hat{y}_{\rm init}$ compatible with it. It turns
out that the form of matrix $A$ given in Theorem 5.2 is the unique form that guarantees that (31) holds for
every pair $(m, \hat{y}_{\rm init})$. In other words, the condition on $A$ that is sufficient for any composition $m$ and any
initial codeword $\hat{y}_{\rm init}$ compatible with it is also necessary for consistency for every pair $(m, \hat{y}_{\rm init})$.

Theorem 5.3. Consider Variant I FPQ using frame $F$ with $M > 3$. If linear reconstruction $\hat{x} = R\hat{y}$
is consistent with every composition and every Variant I initial codeword compatible with it, then matrix
$A = FR$ must be of the form $aI_M + J$, where $a \ge 0$ and $J \in \mathcal{J}$.

Proof. See Section 7.2. 

The column-constant matrices are those obtained by multiplying an all-1s matrix on the right by a
diagonal matrix. Thus, except in the case of an all-0s matrix, a column-constant matrix has rank 1. A
matrix of the form $aI_M + J$ where $a > 0$ and $J \in \mathcal{J}$ thus has rank $M - 1$ or $M$. Since $A = FR$ has rank
at most $N$ because of the dimensions of $F$ and $R$, the necessary and sufficient conditions from Theorems 5.2
and 5.3 imply $M = N$ or $M = N + 1$.

Similar necessary and sufficient conditions can be derived for linear reconstruction of Variant II FPQs. 
Since the partition cell associated with a codeword of a Variant II FPQ is much smaller than that of the 
corresponding Variant I FPQ, we expect the condition for a linear reconstruction to be consistent to be more 
restrictive than that given in Theorems 5.2 and 5.3. The following two theorems show that this is indeed 
the case. 



Theorem 5.4. Suppose $A = FR = aI_M$ for some $a \ge 0$ and $M = N$. Then the linear reconstruction
$\hat{x} = R\hat{y}$ is consistent with Variant II FPQ encoding using frame $F$, an arbitrary composition, and an
arbitrary Variant II initial codeword compatible with it.

Proof. Suppose that $m = (m_1, m_2, \ldots, m_K)$ is an arbitrary composition of $M$ and $\hat{y}_{\rm init}$ is an arbitrary
Variant II initial codeword compatible with it. Let $(P, V)$ be the Variant II FPQ encoding of $x$ using
$(F, m, \hat{y}_{\rm init})$. We would like to check that $\hat{x} = R\hat{y}$ is consistent with the encoding $(P, V)$. This is verified
through the following computation:

$$\begin{aligned}
D^{(m)}VPF\hat{x} &= D^{(m)}VPFR\hat{y} \\
&= D^{(m)}VPFRP^{-1}V^{-1}\hat{y}_{\rm init} && (32) \\
&= D^{(m)}VP\,aI_M\,P^{-1}V^{-1}\hat{y}_{\rm init} && (33) \\
&\ge 0_{L(m)\times 1}, && (34)
\end{aligned}$$

where (32) uses the conventional decoding of a PSC; (33) follows from the hypothesis of the theorem on $A$;
and (34) follows from the definition of Variant II initial codewords compatible with $m$, and the nonnegativity
of $a$. This completes the proof.

Theorem 5.5. Consider Variant II FPQ using frame $F$ with $M > 3$. If linear reconstruction $\hat{x} = R\hat{y}$
is consistent with every composition and every Variant II initial codeword compatible with it, then matrix
$A = FR$ must be of the form $aI_M$, where $a \ge 0$ and $M = N$.

Proof. See Section 7.3. 



The two theorems above show that, if we insist on linear consistent reconstructions for Variant II FPQs, 
the frame must degenerate into a basis. For nonlinear consistent reconstructions, we could use algorithms 
analogous to those presented in Section 4.3 for an arbitrary frame that is not necessarily a basis. 

5.2. Canonical Reconstruction 

We now restrict the linear reconstruction to use the canonical dual; i.e., $R$ is restricted to be the
pseudo-inverse $F^{\dagger} = (F^*F)^{-1}F^*$. The following corollary characterizes the non-trivial frames for which canonical
reconstructions are consistent.

Corollary 5.6. Consider Variant I FPQ using rank-$N$ frame $F$ with $M > N$ and $M > 3$. For canonical
reconstruction to be consistent with every composition and every Variant I initial codeword compatible with
it, it is necessary and sufficient to have $M = N + 1$ and $A = FF^{\dagger} = I_M - \tfrac{1}{M}J_M$, where $J_M$ is the $M \times M$
all-1s matrix.

Proof. Sufficiency follows immediately from Theorem 5.2. From Theorem 5.3, it is necessary to have
$A = FF^{\dagger} = aI_M + J$ for some $a \ge 0$ and $J \in \mathcal{J}$. The rank condition further implies $a > 0$, so we must have
$M = N + 1$ by the argument following Theorem 5.3. Now since $A$ is an orthogonal projection operator, it
is self-adjoint so

$$aI_M + J = (aI_M + J)^* = aI_M + J^*. \tag{35}$$

Thus, $J = J^*$, and it follows that $J = bJ_M$ for some constant $b$. The idempotence of $A$ gives

$$aI_M + bJ_M = (aI_M + bJ_M)^2 = a^2 I_M + (2ab + b^2 M)J_M. \tag{36}$$

Equating the two sides of (36) yields $a = 1$ and $b = -1/M$ as desired.
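The conclusion $a = 1$, $b = -1/M$ can be sanity-checked numerically: the matrix $I_M - \tfrac{1}{M}J_M$ should be symmetric, idempotent, and of trace $M - 1 = N$. The short Python sketch below does this with plain lists; the helper names are assumptions made for the example.

```python
def projection_candidate(m_size):
    """Return A = I_M - J_M / M as a list of rows."""
    return [[(1.0 if r == c else 0.0) - 1.0 / m_size for c in range(m_size)]
            for r in range(m_size)]


def matmul(a, b):
    """Plain-list matrix product, adequate for small M."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b)))
             for c in range(len(b[0]))] for r in range(len(a))]


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


a_mat = projection_candidate(5)                         # M = 5, so N = 4
idempotence_gap = max_abs_diff(matmul(a_mat, a_mat), a_mat)
trace = sum(a_mat[i][i] for i in range(5))              # rank of a projection
```

The trace of an orthogonal projection equals its rank, so a trace of $M - 1$ is consistent with the requirement $M = N + 1$.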



We continue to add more constraints to frame F. Tightness and equal norm are common requirements in 
frame design [1] . By imposing tightness and unit norm on our analysis frame, we can progress a bit further 

from Corollary 5.6 to derive the form of FF*. 

Corollary 5.7. Consider Variant I FPQ using unit-norm tight frame $F$ with $M > N$ and $M > 3$. For
canonical reconstruction to be consistent for every composition and every Variant I initial codeword
compatible with it, it is necessary and sufficient to have $M = N + 1$ and

$$FF^* = \frac{M}{N} I_M - \frac{1}{N} J_M, \tag{37}$$

i.e., $FF^*$ has 1 in every diagonal entry and $-1/N$ in every off-diagonal entry.

Proof. Corollary 5.6 asserts that $M = N + 1$ and

$$FF^{\dagger} = F(F^*F)^{-1}F^* = I_M - \frac{1}{M} J_M. \tag{38}$$

On the other hand, the tightness of a unit-norm frame $F$ implies

$$(F^*F)^{-1} = \frac{N}{M} I_N. \tag{39}$$

Combining (38) with (39), we get (37).



Recall that a UNTF that satisfies (37) is a restricted ETF. Therefore Corollary 5.7 together with 
Proposition 3.7 gives us a complete characterization of UNTFs that are "good" in the sense of canonical 
reconstruction being consistent. 

Corollary 5.8. Consider Variant I FPQ using unit-norm tight frame $F$ with $M > N$ and $M > 3$. For
canonical reconstruction to be consistent for every composition and every Variant I initial codeword
compatible with it, it is necessary and sufficient for $F$ to be a modulated HTF or a Type I or Type II equivalent.



6. Numerical Results 

In this section, we provide simulations to demonstrate some properties of FPQ. For data compression,
we demonstrate that FPQ with decoding using Algorithms 3 and 4 can give performance better than
ECSQ and ordinary PSC for certain combinations of signal dimension and rate. For data acquisition,
we demonstrate that FPQ with recursive estimation through Algorithm 5 empirically gives the optimal
decay of MSE, inversely proportional to the square of the number of orthogonal projection steps, validating
Theorem 4.3 but also suggesting that this holds more generally.

6.1. Fixed-Rate Compression Experiments 

All FPQ compression simulations use modulated harmonic tight frames and are based on implementations
of Algorithms 3 and 4 using MATLAB, with linear programming and quadratic programming provided by
the Optimization Toolbox. For every data point shown, the distortion represents a sample mean estimate
of $N^{-1}E[\|x - \hat{x}\|^2]$ over at least 10^ trials. Testing was done with exhaustive enumeration of the relevant
compositions. This makes the complexity of simulation high, and thus experiments are only shown for small
$N$ and $M$. Recall the encoding complexity of FPQ is low, $O(M \log M)$ operations. The decoding complexity
is polynomial in $M$ for either of the algorithms presented explicitly, and in some applications it could be
worthwhile to precompute the entire codebook at the decoder. Thus much larger values of $N$ and $M$ than
used here may be practical.

Figure 2: Performance of Variant I FPQ on an i.i.d. uniform($[-\tfrac{1}{2}, \tfrac{1}{2}]$) source using modulated harmonic tight frames ranging
in size from $N$ to 7. Also shown are the performances of ordinary PSC (equivalent to FPQ with frame $F = I_N$), and
entropy-constrained scalar quantization.

Uniform source. Let $x$ have i.i.d. components uniformly distributed on $[-\tfrac{1}{2}, \tfrac{1}{2}]$. Algorithm 3 is clearly
well-suited to this source since the support of $x$ is properly specified and reconstructions near the centers
of cells are nearly optimal. Fig. 2 summarizes the performance of Variant I FPQ for several frames and
an enormous number of compositions. Also shown are the performances of ordinary PSC and
entropy-constrained scalar quantization.

Using $F = I_N$ makes FPQ reduce to ordinary PSC. We see that, consistent with results in [29], PSC is
sometimes better than ECSQ. Next notice that FPQ is not identical to PSC when $F$ is square but not the
identity matrix. The modulated harmonic frame with $M = N$ provides an orthogonal matrix $F$. The set
of rates obtained with $M = N$ is the same as PSC, but since the source is not rotationally-invariant, the
partitions and hence distortions are not the same; the distortion is sometimes better and sometimes worse.
Increasing $M$ gives more operating points (some of which are better than those for lower $M$) and a higher
maximum rate.* In particular, for both $N = 4$ and $N = 5$, it seems that $M = N + 1$ gives several operating
points better than those obtainable with larger or smaller values of $M$.

Gaussian source. Let $x$ have the $\mathcal{N}(0, I_N)$ distribution. Algorithm 4 is designed precisely for this source.
Fig. 3 summarizes the performance of Variant I FPQ with decoding using Algorithm 4. Also shown are
the performance of entropy-constrained scalar quantization and the distortion-rate bound. Of course, the
distortion-rate bound can only be approached with $N \to \infty$; it is not presented as a competitive alternative
to FPQ for $N = 4$ and $N = 5$.

We have not provided an explicit comparison to ordinary PSC because, due to rotational invariance of
the Gaussian source, FPQ with any orthonormal basis as the frame is identical to PSC. (The modulated
harmonic tight frame with $M = N$ is an orthonormal basis.) The trends are similar to those for the uniform
source: PSC and FPQ are sometimes better than ECSQ; increasing $M$ gives more operating points and a
higher maximum rate; and $M = N + 1$ seems especially attractive.

*A discussion of the density of PSC rates is given in [43, App. B].

Figure 3: Performance of Variant I FPQ on an i.i.d. $\mathcal{N}(0, 1)$ source using modulated harmonic tight frames ranging in size from $N$
to 7. Performance of PSC is not shown because it is equivalent to FPQ with $M = N$ for this source. Also plotted are the
performance of entropy-constrained scalar quantization and the distortion-rate bound.



6.2. Variable-Rate Compression Experiments and Discussion 

We have posed FPQ as a fixed-rate coding technique. As mentioned in Section 3.2, symmetries will often 
make the outputs of a PSC equally likely, making variable-rate coding superfluous. This does not necessarily 
carry over to FPQ. 

In Variant I FPQ with modulated HTFs, when M > N +1 the codewords are not only nonequiprobable, 
some cannot even occur. To see an example of this, consider the case of {N, M) = (2, 4). Then 



F = 



1 





-c ~c 



< 



where $\zeta$ denotes $1/\sqrt{2}$.



If we choose the composition $m = (2, 2)$, we might expect six distinct codewords that are equiprobable for
a rotationally-invariant source. The permutation matrices consistent with this composition are

The first and fifth of these occur with probability zero because the corresponding partition cells have zero
volume. Let us verify this for the fifth permutation matrix, $P_5$. By forming $D^{((2,2))}P_5F$, we see that
the fifth cell is described by

$$D^{((2,2))}P_5Fx \ge 0_{4\times 1}. \tag{40}$$

This has no nonzero solutions. (Subtracting the second and third inequalities from the first gives $2\zeta x_1 \ge 0$,
which combines with the fourth inequality to give $x_1 = 0$. With $x_1 = 0$, the first and third inequalities
combine to give $x_2 = 0$.)

Figure 4: Performance of Variant I FPQ for fixed- and variable-rate coding of an i.i.d. uniform($[-\tfrac{1}{2}, \tfrac{1}{2}]$) source with $N = 4$ using
modulated harmonic tight frames of sizes 6 and 7. Also plotted is the performance of entropy-constrained scalar quantization.

While further investigation of the joint design of the composition m and frame F — or of the product 
varies over the permutations induced by m — is merited, it is beyond the scope of this paper. 
Instead, we have extended our experiments with uniform sources to show the potential benefit of using 
entropy coding to exploit the lack of equiprobable codewords. 

Fig. 4 summarizes experiments similar to those reported in Figs. 2 and 3. Each curve in this figure shows, for any given rate $R$ on the horizontal axis, the lowest distortion that can be achieved at any rate not exceeding $R$. The source $x \in \mathbb{R}^4$ has i.i.d. components uniformly distributed on $[-\tfrac{1}{2}, \tfrac{1}{2}]$, and Variant I FPQ with modulated harmonic tight frames of sizes $M = 6$ and 7 was used. Performance with rate measured only by (15) as before is labeled fixed rate. The codewords are highly nonequiprobable at all but the lowest rates. To demonstrate this, we alternatively measure rate by the empirical output entropy and label the performance variable rate. Clearly, the rate is significantly reduced by entropy coding at all but the lowest rates.
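The gap between the two rate measures can be illustrated on a toy case. The sketch below is not the experiment of Fig. 4: it uses an assumed $(N, M) = (2, 4)$ modulated harmonic tight frame, the composition $m = (2, 2)$, and a Gaussian source. It also assumes that the fixed rate is the logarithm of the number of nominal codewords divided by $N$. The helper `empirical_entropy_bits` and all variable names are hypothetical.

```python
import math
from collections import Counter
import numpy as np

def empirical_entropy_bits(labels):
    """Entropy (in bits) of the empirical distribution of hashable labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

zeta = 1 / math.sqrt(2)
# Assumed analysis frame (rows are frame vectors).
F = np.array([[1.0, 0.0], [-zeta, -zeta], [0.0, 1.0], [zeta, -zeta]])

rng = np.random.default_rng(0)
X = rng.standard_normal((200_000, 2))        # rotationally invariant source
Y = X @ F.T                                  # frame coefficients, one row per sample
top2 = np.sort(np.argsort(-Y, axis=1)[:, :2], axis=1)
labels = [tuple(row) for row in top2.tolist()]

N = 2
fixed_rate = math.log2(math.comb(4, 2)) / N  # log2(6)/2 bits per component (assumed rate formula)
variable_rate = empirical_entropy_bits(labels) / N
print(f"fixed rate    : {fixed_rate:.3f} bits/component")
print(f"variable rate : {variable_rate:.3f} bits/component")
```

In this toy setting the empirical entropy is lower than the nominal rate because two of the six nominal codewords never occur.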

6.3. Recursive Estimation Experiments 

The recursive estimation technique detailed in Algorithm 5 remains computationally feasible for large $N$ and $M$. Here we simulate it with $x \in \mathbb{R}^N$ a nonrandom unit vector and $\{\phi_k\}_{k=1}^{M}$ an i.i.d. sequence of vectors drawn from the uniform distribution on the unit sphere in $\mathbb{R}^N$. Several choices for the $\mathcal{J}_k$ sets are used:

• Singleton sets: $\mathcal{J}_k = \{k - 1\}$;

• Square-root sets: $\mathcal{J}_k \subset \{1, 2, \ldots, k - 1\}$ is chosen uniformly at random from subsets of size $\lfloor \sqrt{k} \rfloor$; and

• Exhaustive sets: $\mathcal{J}_k = \{1, 2, \ldots, k - 1\}$.
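The three choices can be sketched as follows. This is an illustration only: the function names `index_set` and `total_projections` are hypothetical, indices are 1-based as in the text, the singleton set is assumed to be $\{k-1\}$, and one projection is counted per element of each set for $k = 2, \ldots, M$.

```python
import math
import numpy as np

def index_set(k, kind, rng):
    """Return the set J_k (as a sorted list of 1-based indices) for k >= 2."""
    if k < 2:
        raise ValueError("J_k is only defined here for k >= 2")
    previous = np.arange(1, k)                 # {1, 2, ..., k-1}
    if kind == "singleton":
        return [k - 1]                         # assumed form of the singleton set
    if kind == "sqrt":
        size = math.isqrt(k)                   # floor(sqrt(k))
        return sorted(rng.choice(previous, size=size, replace=False).tolist())
    if kind == "exhaustive":
        return previous.tolist()
    raise ValueError(f"unknown kind: {kind}")

def total_projections(M, kind, rng):
    """Total number of indices used when processing k = 2, ..., M."""
    return sum(len(index_set(k, kind, rng)) for k in range(2, M + 1))

rng = np.random.default_rng(0)
M = 100
for kind in ("singleton", "sqrt", "exhaustive"):
    print(kind, total_projections(M, kind, rng))
```

The exhaustive count equals $\tfrac{1}{2}M(M-1)$, while the singleton count grows linearly in $M$.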

Figure 5 shows the sample mean estimate of $N^{-1}E[\|x - \hat{x}_M\|^2]$ over 1000 trials with $N = 8$ and $M$ up to 10 000.

With singleton sets, we expect to see $\|x - \hat{x}_M\|^2 = \Theta(M^{-2})$ when $M$ is increased without bound for any fixed $N$; Theorem 4.3 gives an upper bound of this order, and related lower bound results include [19, Thm. 6.1] and [8, Thm. 3]. With exhaustive sets, the total number of projections is $\tfrac{1}{2}M(M - 1)$, so squared error decay with the square of the number of projections would give $\|x - \hat{x}_M\|^2 = \Theta(M^{-4})$, and we indeed see this. Similarly, the empirical behavior is $\|x - \hat{x}_M\|^2 = \Theta(M^{-3})$ with square-root sets.
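The three exponents are consistent with a simple count of projections, under the heuristic that squared error decays with the square of the number of projections. The counts below assume one projection per element of each index set for $k = 2, \ldots, M$; the symbol $P(M)$ for the total count is introduced here only for illustration.

```latex
\begin{align*}
\text{singleton:}   \quad & P(M) = M - 1
  && \Rightarrow\ P(M)^{-2} = \Theta(M^{-2}), \\
\text{square-root:} \quad & P(M) = \sum_{k=2}^{M} \lfloor \sqrt{k} \rfloor \approx \tfrac{2}{3} M^{3/2}
  && \Rightarrow\ P(M)^{-2} = \Theta(M^{-3}), \\
\text{exhaustive:}  \quad & P(M) = \sum_{k=2}^{M} (k-1) = \tfrac{1}{2} M (M-1)
  && \Rightarrow\ P(M)^{-2} = \Theta(M^{-4}).
\end{align*}
```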
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[Figure 5 plot; horizontal axis: analysis frame size $M$]
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Figure 5: Performance of recursive estimation for Variant I FPQ. The signal is of dimension N = 8 and various index sets 
are used. 

7. Proofs 

7.1. Proof of Proposition 3.7 

In order to prove Proposition 3.7, we need the following lemmas. 

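Lemma 7.1. Assume that $M = N+1$ and let $W = e^{j2\pi/M}$. Then for all $a \in \mathbb{R}$ we have

$$\sum_{i=1}^{N/2} W^{a(2i-1)/2} = \frac{(-1)^a - W^{a/2}}{W^a - 1} \qquad (N \text{ even})$$

and

$$\sum_{i=1}^{(N-1)/2} W^{ai} = \frac{(-1)^a - W^{a}}{W^a - 1} \qquad (N \text{ odd}).$$

Proof. By noting that $W^{M/2} = -1$, we have the following computations.

N even:

$$\sum_{i=1}^{N/2} W^{a(2i-1)/2} = W^{-a/2} \sum_{i=1}^{N/2} W^{ai} = W^{-a/2} \cdot \frac{W^{a(N/2+1)} - W^a}{W^a - 1} = W^{-a/2} \cdot \frac{(W^{M/2})^a W^{a/2} - W^a}{W^a - 1} = \frac{(-1)^a - W^{a/2}}{W^a - 1}.$$

N odd:

$$\sum_{i=1}^{(N-1)/2} W^{ai} = \frac{W^{a(N+1)/2} - W^a}{W^a - 1} = \frac{(W^{M/2})^a - W^a}{W^a - 1} = \frac{(-1)^a - W^a}{W^a - 1}.$$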
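The two closed forms of Lemma 7.1 can be checked numerically. The sketch below restricts $a$ to nonzero integers with $|a| \le N$, which is the range used when the lemma is applied with $a = k - \ell$ and which keeps $W^a \neq 1$. The function name `lemma_7_1_gap` is hypothetical.

```python
import numpy as np

def lemma_7_1_gap(N, a):
    """Absolute difference between the finite sum and its closed form (M = N + 1)."""
    M = N + 1

    def W_pow(t):
        # W**t with W = exp(j*2*pi/M), evaluated directly to avoid branch issues.
        return np.exp(2j * np.pi * t / M)

    sign = (-1.0) ** a                          # (-1)**a for integer a
    if N % 2 == 0:
        i = np.arange(1, N // 2 + 1)
        lhs = W_pow(a * (2 * i - 1) / 2).sum()
        rhs = (sign - W_pow(a / 2)) / (W_pow(a) - 1)
    else:
        i = np.arange(1, (N - 1) // 2 + 1)
        lhs = W_pow(a * i).sum()
        rhs = (sign - W_pow(a)) / (W_pow(a) - 1)
    return abs(lhs - rhs)

worst = max(lemma_7_1_gap(N, a)
            for N in range(2, 10)
            for a in range(-N, N + 1) if a != 0)
print(f"largest discrepancy: {worst:.2e}")
```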

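Lemma 7.2. For $M = N + 1$, the HTF $\Phi = \{\phi_k\}_{k=1}^{M}$ satisfies $\langle \phi_k, \phi_\ell \rangle = (-1)^{k-\ell+1}/N$, for all $1 \le k < \ell \le M$.

Proof. Consider the two following cases.

N even: Using Euler's formula, for $k = 0, 1, \ldots, M - 1$, $\phi_k$ can be rewritten as

$$\phi_k = \sqrt{\frac{2}{N}} \left[ \frac{W^{k/2} + W^{-k/2}}{2}, \frac{W^{3k/2} + W^{-3k/2}}{2}, \ldots, \frac{W^{(N-1)k/2} + W^{-(N-1)k/2}}{2}, \frac{W^{k/2} - W^{-k/2}}{2j}, \frac{W^{3k/2} - W^{-3k/2}}{2j}, \ldots, \frac{W^{(N-1)k/2} - W^{-(N-1)k/2}}{2j} \right].$$

For $1 \le k < \ell \le M$, let $a = k - \ell$. After some algebraic manipulations we can obtain

$$\begin{aligned}
N \cdot \langle \phi_k, \phi_\ell \rangle &= \sum_{i=1}^{N/2} W^{(k-\ell)(2i-1)/2} + \sum_{i=1}^{N/2} W^{(\ell-k)(2i-1)/2} \\
&= \frac{(-1)^a - W^{a/2}}{W^a - 1} + \frac{(-1)^{-a} - W^{-a/2}}{W^{-a} - 1} \qquad (41) \\
&= \frac{(-1)^a - W^{a/2}}{W^a - 1} + \frac{W^{a/2} - (-1)^a W^{a}}{W^a - 1} \\
&= (-1)^{a+1},
\end{aligned}$$

where (41) is obtained using Lemma 7.1.

N odd: Similarly, for $1 \le k < \ell \le M$ and $a = k - \ell$, we have

$$\begin{aligned}
N \cdot \langle \phi_k, \phi_\ell \rangle &= 1 + \sum_{i=1}^{(N-1)/2} W^{ai} + \sum_{i=1}^{(N-1)/2} W^{-ai} \\
&= 1 + \frac{(-1)^a W^{-a/2} - W^{a/2}}{W^{a/2} - W^{-a/2}} + \frac{(-1)^a W^{a/2} - W^{-a/2}}{W^{-a/2} - W^{a/2}} \qquad (42) \\
&= 1 - (-1)^a - 1 \\
&= (-1)^{a+1},
\end{aligned}$$

where (42) is due to Lemma 7.1.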



Proof (Proposition 3.7). For a modulated HTF $\Psi = \{\psi_k\}_{k=1}^{M}$, as defined in Definition 3.4, for all $1 \le k < \ell \le M$ we have

$$\begin{aligned}
\langle \psi_k, \psi_\ell \rangle &= \gamma^2 (-1)^{k+\ell} (-1)^{k-\ell+1}/N \qquad (43) \\
&= (-1)^{k+\ell} (-1)^{k-\ell+1}/N \qquad (44) \\
&= -1/N, \qquad (45)
\end{aligned}$$

where (43) is due to Lemma 7.2, and (44) is true because $|\gamma| = 1$. Since the inner product is preserved through an orthogonal mapping, (45) is true for Type I and/or Type II equivalences of modulated HTFs as well. The tightness and unit norm of the HTF are obviously preserved for Type I and/or Type II equivalences. Therefore, the modulated HTFs and their equivalences of Type I and/or Type II are all restricted ETFs.
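The equiangularity in (45) can be checked numerically. The sketch below assumes the usual cosine/sine construction of a real harmonic tight frame (indexed $k = 0, \ldots, M-1$, matching the Euler-formula expression in the proof of Lemma 7.2) and the modulation $\psi_k = (-1)^k \phi_k$ with $\gamma = 1$. The function names `real_htf` and `max_offdiag_deviation` are hypothetical.

```python
import numpy as np

def real_htf(N, M):
    """Rows are the M unit-norm vectors of a real harmonic tight frame in R^N.

    Assumed construction from cosine/sine samples, k = 0, ..., M-1.
    """
    k = np.arange(M)[:, None]
    if N % 2 == 0:
        freq = np.arange(1, N, 2)[None, :]                 # 1, 3, ..., N-1
        cols = [np.cos(freq * k * np.pi / M), np.sin(freq * k * np.pi / M)]
    else:
        freq = np.arange(2, N, 2)[None, :]                 # 2, 4, ..., N-1
        cols = [np.full((M, 1), 1 / np.sqrt(2)),
                np.cos(freq * k * np.pi / M), np.sin(freq * k * np.pi / M)]
    return np.sqrt(2 / N) * np.hstack(cols)

def max_offdiag_deviation(N):
    """Largest |<psi_k, psi_l> + 1/N| over k != l for the modulated HTF, M = N + 1."""
    M = N + 1
    Phi = real_htf(N, M)
    signs = (-1.0) ** np.arange(M)
    Psi = signs[:, None] * Phi                             # modulation with gamma = 1
    G = Psi @ Psi.T                                        # Gram matrix
    off = G[~np.eye(M, dtype=bool)]
    return np.max(np.abs(off + 1 / N))

for N in (2, 3, 4, 5, 8):
    print(N, max_offdiag_deviation(N))
```

For every tested $N$, all off-diagonal Gram entries equal $-1/N$ up to rounding.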

Conversely, from Proposition 3.5, every restricted ETF $\Psi = \{\psi_k\}_{k=1}^{M}$ can be represented up to Type I and Type II equivalences as follows:

$$\psi_k = \delta(k)\, \phi_k, \quad \text{for all } 1 \le k \le M,$$

where $\delta(k) = \pm 1$ is some sign function on $k$. Thus, the constraint $\langle \psi_k, \psi_\ell \rangle = a$ for some constant $a$ of a restricted ETF is equivalent to

$$aN = N \delta(k) \delta(\ell) \cdot \langle \phi_k, \phi_\ell \rangle = \delta(k) \delta(\ell) (-1)^{k-\ell+1}, \quad \text{for all } 1 \le k < \ell \le M.$$

Therefore, $\delta(k) \delta(\ell) (-1)^{k-\ell}$ is constant for all $1 \le k < \ell \le M$. If we fix $k$ and vary $\ell$, it is clear that the sign of $\delta(\ell)$ must alternate. Thus, $\Psi$ is one of the two HTFs specified in the proposition, completing the proof.

7.2. Proof of Theorem 5.3 

The following lemmas are all stated for Variant II initial codewords. They are somewhat stronger than 
what we need for the proof of Theorem 5.3 because a Variant II initial codeword is automatically a Variant I 
initial codeword. However, these lemmas will be reused to prove Theorem 5.5 in Section 7.3. 

For convenience, if $\{i_1, \ldots, i_k\}$ is a subset of $\{1, 2, \ldots, M\}$ and $\sigma$ is a permutation on that subset, we simply write

$$P = \begin{pmatrix} i_1 & i_2 & \cdots & i_k \\ \sigma(i_1) & \sigma(i_2) & \cdots & \sigma(i_k) \end{pmatrix}$$

if $Py$ maps $y_{i_j}$ to $y_{\sigma(i_j)}$, $1 \le j \le k$, and fixes all the other components of vector $y$. This notation with round brackets should not be confused with a matrix, for which we always use square brackets.

Proofs of the lemmas rely heavily on the key observation that the operator $P(\cdot)P^{-1}$ first permutes the columns of the original matrix, then permutes the rows of the resulting matrix in the same manner.
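The key observation can be illustrated for a single transposition. The sketch below is an example only, with a hypothetical helper `transposition_matrix`. It conjugates a test matrix by the permutation matrix that swaps components 2 and 4, and compares the result with direct reindexing of rows and columns.

```python
import numpy as np

def transposition_matrix(M, k, l):
    """Permutation matrix that swaps components k and l (1-based) of a length-M vector."""
    P = np.eye(M)
    P[[k - 1, l - 1]] = P[[l - 1, k - 1]]
    return P

M = 4
A = np.arange(1, M * M + 1, dtype=float).reshape(M, M)
P = transposition_matrix(M, 2, 4)
A_tilde = P @ A @ P.T            # P^{-1} = P^T for a permutation matrix

# Same result by reindexing rows and columns with the swapped index list.
idx = np.array([0, 3, 2, 1])     # 0-based version of the swap of components 2 and 4
A_reindexed = A[np.ix_(idx, idx)]
print(np.array_equal(A_tilde, A_reindexed))
```

In particular, when neither swapped index equals 1, the first column of the conjugated matrix is the first column of the original matrix with the two corresponding rows exchanged, which is the fact used in Case 2 of the proof of Lemma 7.3.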

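Lemma 7.3. Assume that $M \ge 3$. If the entries of matrix $A$ satisfy $a_{k,1} \neq a_{\ell,1}$ for some $1 < k < \ell \le M$, then there exists a pair $(P, y_{\text{init}})$, where $P$ is a permutation matrix and $y_{\text{init}}$ is a Variant II initial codeword compatible with some composition, such that the inequality (31) is violated.

Proof. Consider the two following cases.

Case 1: If $a_{k,1} < a_{\ell,1}$, choose $P = I_M$ and $y_{\text{init}} = (\mu_1, \mu_2, \ldots, \mu_M)$. Consider the following difference:

$$\begin{aligned}
\Delta_{k,\ell} &= \langle (a_{k,j})_j, y_{\text{init}} \rangle - \langle (a_{\ell,j})_j, y_{\text{init}} \rangle \\
&= \sum_{j=1}^{M} a_{k,j} \mu_j - \sum_{j=1}^{M} a_{\ell,j} \mu_j \\
&= (a_{k,1} - a_{\ell,1}) \mu_1 + \left( \sum_{j=2}^{M} a_{k,j} \mu_j - \sum_{j=2}^{M} a_{\ell,j} \mu_j \right).
\end{aligned}$$

Fix $\mu_2 > \mu_3 > \cdots > \mu_M \ge 0$ and let $\mu_1$ go to $+\infty$. Since $a_{k,1} < a_{\ell,1}$, $\Delta_{k,\ell}$ will go to $-\infty$. Thus, there exist $\mu_1 > \mu_2 > \cdots > \mu_M \ge 0$ such that $\Delta_{k,\ell} < 0$. On the other hand, for $m = (1, 1, \ldots, 1)$, inequality (31) requires that $\Delta_{k,\ell} \ge 0$ for all $k < \ell$. Therefore the chosen pair violates inequality (31).

Case 2: If $a_{k,1} > a_{\ell,1}$, choose $P = \begin{pmatrix} k & \ell \\ \ell & k \end{pmatrix}$. Since $k, \ell \neq 1$, the entries of matrix $\tilde{A} = PAP^{-1}$ will satisfy $\tilde{a}_{k,1} < \tilde{a}_{\ell,1}$. We return to the first case, completing the proof.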
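Case 1 can be illustrated with a small numerical example. The sketch assumes that, for $P = I_M$ and $m = (1, 1, \ldots, 1)$, inequality (31) reads $D A y_{\text{init}} \ge 0$ with $D$ the first-order differencing matrix. The matrix `A` and the codeword values are arbitrary choices made for illustration.

```python
import numpy as np

def differencing_matrix(M):
    """(M-1) x M matrix whose i-th row computes y_i - y_{i+1}."""
    return np.eye(M - 1, M) - np.eye(M - 1, M, k=1)

M = 4
A = np.eye(M)
A[2, 0] = 0.1                     # a_{2,1} = 0 < a_{3,1} = 0.1 (1-based indices)
D = differencing_matrix(M)

for mu1 in (4.0, 20.0, 100.0):
    y_init = np.array([mu1, 3.0, 2.0, 1.0])     # mu_1 > mu_2 > ... > mu_M > 0
    print(mu1, D @ A @ y_init)
```

For the smallest value of $\mu_1$ all differences are nonnegative, but the second difference, $\Delta_{2,3} = 1 - 0.1\mu_1$, becomes negative once $\mu_1$ is large enough.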

Lemma 7.4. Assume that $M \ge 3$. If the entries of matrix $A$ satisfy $a_{k,j} \neq a_{\ell,j}$ for some pairwise distinct triple $(k, j, \ell)$, then there exists a pair $(P, y_{\text{init}})$, where $P$ is a permutation matrix and $y_{\text{init}}$ is a Variant II initial codeword compatible with some composition, such that the inequality (31) is violated.

Proof. We first show that there exists some permutation matrix $P_1$ such that $\tilde{A} = P_1 A P_1^{-1}$ satisfies the hypothesis of Lemma 7.3. Indeed, consider the following cases:

1. If $j = 1$, it is obvious to choose $P_1 = I_M$.

2. If $j > 1$ and $k > 1$, choosing $P_1 = \begin{pmatrix} 1 & j \\ j & 1 \end{pmatrix}$ yields $\tilde{a}_{k,1} = a_{k,j} \neq a_{\ell,j} = \tilde{a}_{\ell,1}$ since $k, \ell \notin \{1, j\}$.

3. If $j > 1$ and $k = 1$, choosing $P_1 = \begin{pmatrix} 1 & j \\ j & 1 \end{pmatrix}$ yields $\tilde{a}_{j,1} = a_{k,j} \neq a_{\ell,j} = \tilde{a}_{\ell,1}$ since $k = 1$ and $\ell \notin \{1, j\}$. Note that in this case, $j \neq 1$, and so $\tilde{A}$ satisfies the hypothesis of Lemma 7.3.

Now with $P_1$ chosen as above, according to Lemma 7.3 there exists a pair $(P_2, y_{\text{init}})$, where $P_2$ is a permutation matrix and $y_{\text{init}}$ is a Variant II initial codeword compatible with some composition $m$, such that

$$\begin{aligned}
0 &\nleq D^{(m)} P_2 \tilde{A} P_2^{-1} y_{\text{init}} \\
&= D^{(m)} P_2 (P_1 A P_1^{-1}) P_2^{-1} y_{\text{init}} \\
&= D^{(m)} P A P^{-1} y_{\text{init}},
\end{aligned}$$

where $P = P_2 P_1$. Since the product of any two permutation matrices is also a permutation matrix, the pair $(P, y_{\text{init}})$ violates the inequality (31).

Lemma 7.5. Suppose that $A$ is a diagonal matrix. Then the inequality (31) holds for every composition and every Variant II initial codeword compatible with it, only if $A$ is equal to the identity matrix scaled by a nonnegative factor.

Proof. Suppose that $A = \mathrm{diag}(a_1, a_2, \ldots, a_M)$. We first show that $a_i \ge 0$ for every $i$ by contradiction.

If $a_1 < 0$, we can choose $P = I_M$ and $\mu_1 > \mu_2 > \cdots > \mu_M \ge 0$, where $\mu_1$ is large enough relative to $\mu_2, \ldots, \mu_M$ to violate inequality (31).

If $a_j < 0$ for some $1 < j \le M$, using $P = \begin{pmatrix} 1 & j \\ j & 1 \end{pmatrix}$ yields $a'_1 = a_j < 0$, where $a'_1$ is the first entry on the diagonal of matrix $A' = PAP^{-1}$. Repeating the previous argument, we get the contradiction.

Now we show that if $a_k \neq a_\ell$ for some $1 \le k < \ell \le M$, there exists a pair $(P, y_{\text{init}})$, where $P$ is a permutation matrix and $y_{\text{init}}$ is a Variant II initial codeword compatible with some composition, such that inequality (31) is violated.

Case 1: If $a_k < a_\ell$, choose $P = I_M$ and consider $y_{\text{init}} = (\mu_1, \mu_2, \ldots, \mu_M)$, where $\mu_\ell = \mu_k - \varepsilon$ for some positive $\varepsilon$. Choose $\mu_k$ such that

$$\mu_k > \frac{a_\ell \varepsilon}{a_\ell - a_k} > 0. \qquad (46)$$

On the other hand, we can choose $\varepsilon$ small enough so that $\mu_\ell$ is positive as well. The other components can therefore be chosen to make $y_{\text{init}}$ a Variant II initial codeword compatible with composition $m = (1, 1, \ldots, 1)$. For the above choice of $\mu_k$ we can easily check that $\Delta_{k,\ell} = a_k \mu_k - a_\ell \mu_\ell < 0$, violating inequality (31).

Case 2: If $a_k > a_\ell$, choosing $P = \begin{pmatrix} k & \ell \\ \ell & k \end{pmatrix}$ yields $PAP^{-1} = \mathrm{diag}(a_1, \ldots, a_\ell, \ldots, a_k, \ldots, a_M)$. We return to case 1, completing the proof.
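The choice in (46) can be made concrete. The sketch below uses an arbitrary diagonal with $a_2 < a_3$ and an arbitrary $\varepsilon$; the bound is taken to be $a_\ell \varepsilon / (a_\ell - a_k)$, which follows from requiring $a_k \mu_k - a_\ell(\mu_k - \varepsilon) < 0$. All numerical values are illustrative.

```python
import numpy as np

a = np.array([1.0, 1.0, 2.0, 2.5])     # diagonal of A; a_k < a_l for k = 2, l = 3 (1-based)
k, l = 2, 3
eps = 0.5
bound = a[l - 1] * eps / (a[l - 1] - a[k - 1])   # right-hand side of (46)
mu_k = bound + 1.0                                # any value above the bound works
mu_l = mu_k - eps

# Remaining components chosen so that the codeword is strictly decreasing and positive.
y_init = np.array([mu_k + 1.0, mu_k, mu_l, mu_l / 2])
delta_kl = a[k - 1] * y_init[k - 1] - a[l - 1] * y_init[l - 1]
print(bound, delta_kl)
```

Here the bound equals 1 and $\Delta_{2,3} = -1 < 0$, so the ordering required by inequality (31) fails between components 2 and 3.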

Proof (Theorem 5.3). First note that a Variant II initial codeword is always a Variant I initial codeword; therefore, Lemmas 7.3, 7.4, and 7.5 also apply for Variant I initial codewords. From Lemma 7.4, all entries on each column of matrix $A$ are constant except for the one that lies on the diagonal. Thus, $A$ can be written as $A = \tilde{I} + J$, where $\tilde{I} = \mathrm{diag}(a_1, a_2, \ldots, a_M)$, and

$$J = \begin{bmatrix} b_1 & b_2 & \cdots & b_M \\ b_1 & b_2 & \cdots & b_M \\ \vdots & \vdots & & \vdots \\ b_1 & b_2 & \cdots & b_M \end{bmatrix}.$$

Recall that from properties (P1) and (P2) of $\mathcal{J}$ we have $PJP^{-1} \in \mathcal{J}$ and

$$D^{(m)} P J P^{-1} = 0, \quad \text{for any } m.$$

Hence,

$$D^{(m)} P \tilde{I} P^{-1} y_{\text{init}} \ge 0, \quad \text{for any } m \text{ and any } y_{\text{init}}. \qquad (47)$$

From (47) and Lemma 7.5, we can deduce that $\tilde{I} = a I_M$, for some nonnegative constant $a$.
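The role of the constant-column matrix $J$ can be checked numerically for a small case. The sketch below assumes that inequality (31) reads $D^{(m)} P A P^{-1} y_{\text{init}} \ge 0$ and takes $m = (1, 1, \ldots, 1)$, so that $D^{(m)}$ is the first-order differencing matrix. The values of `a`, `b`, and `y_init` are arbitrary.

```python
import itertools
import numpy as np

M = 4
rng = np.random.default_rng(1)
D = np.eye(M - 1, M) - np.eye(M - 1, M, k=1)   # differencing matrix for m = (1, ..., 1)
b = rng.standard_normal(M)
J = np.tile(b, (M, 1))                          # every row equals (b_1, ..., b_M)
a = 0.7
A = a * np.eye(M) + J
y_init = np.array([4.0, 3.0, 2.0, 1.0])

worst_J = 0.0
all_consistent = True
for perm in itertools.permutations(range(M)):
    P = np.eye(M)[list(perm)]
    # Conjugating J keeps all rows identical, so differencing annihilates it.
    worst_J = max(worst_J, np.abs(D @ P @ J @ P.T).max())
    all_consistent &= bool(np.all(D @ P @ A @ P.T @ y_init >= -1e-12))
print(worst_J, all_consistent)
```

For every permutation, the $J$ term contributes nothing after differencing, and the remaining term $a D y_{\text{init}}$ is nonnegative whenever $a \ge 0$.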

7.3. Proof of Theorem 5.5 

In order for $R$ to produce consistent reconstructions, we need the following inequality (noting that $V = V^{-1}$ for any $V \in Q(m)$):

$$\widetilde{D}^{(m)} V P A P^{-1} V y_{\text{init}} \ge 0, \quad \text{for any } V \in Q(m) \text{ and } P \in \mathcal{G}(m), \qquad (48)$$

where $A = FR$. We first fix the sign-changing matrix $V$ to be the identity matrix $I_M$. Then the first $L(m)$ rows of (48) exactly form the inequality (31). Since Lemmas 7.3, 7.4, and 7.5 are stated for Variant II initial codewords, it follows from Theorem 5.3 that $A$ must be of the form $aI_M + J$, where $a \ge 0$ and $J \in \mathcal{J}$. Substituting into (48), we obtain

$$a \widetilde{D}^{(m)} y_{\text{init}} + \widetilde{D}^{(m)} V P J P^{-1} V y_{\text{init}} \ge 0. \qquad (49)$$

Now we show that $J = 0$ by contradiction. Suppose all entries in column $i$ of $J$ are $b_i$, for $1 \le i \le M$. Consider the following cases:

1. If $b_1$ is negative, choose $V = P = I_M$ and $y_{\text{init}} = (\mu_1, \mu_2, \ldots, \mu_M)$ compatible with composition $m = (1, 1, \ldots, 1)$. Consider the last row of inequality (49):

M 
i=2 

Since $M \ge 3$, $M - 1 \neq 1$. Therefore the scale associated with $\mu_1$ in the left-hand side of inequality (50) is $b_1 < 0$. Hence, choosing $\mu_1$ large enough certainly breaks inequality (50), and therefore violates inequality (49).

2. If $b_1$ is positive, choosing $P = I_M$, $V = \mathrm{diag}(-1, 1, 1, \ldots, 1)$ makes the first entry of the $(M - 1)$th row of matrix $VPJP^{-1}V$ negative (note that $M - 1 \neq 1$ and the operator $V(\cdot)V$ first changes the signs of columns of the original matrix and then changes the signs of rows of the resulting matrix in the same manner). Repeating the argument in the first case we can break the last row of inequality (49) by appropriate choice of $y_{\text{init}}$.

3. If column $\ell$ of $J$, $1 < \ell \le M$, is different from zero, choosing $P = \begin{pmatrix} 1 & \ell \\ \ell & 1 \end{pmatrix}$ leads us to either case 1 or case 2.

Hence,

$$A = FR = aI_M. \qquad (51)$$

Equality (51) states that the row vectors of $F$ and the column vectors of $R$ form a biorthogonal basis pair of $\mathbb{R}^N$ within a nonnegative scale factor. Since the number of vectors in each basis cannot exceed the dimension of the space, we can deduce $M \le N$. On the other hand, $M \ge N$ because $F$ is a frame. Thus, $M = N$.
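The dimension count behind the last step can be illustrated with a rank check. This is a sketch with randomly drawn matrices, not part of the proof: for $M > N$, the $M \times M$ product $FR$ has rank at most $N$, so it cannot equal $aI_M$ with $a > 0$, whereas for $M = N$ an inverse pair does give the identity.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 3, 5
F = rng.standard_normal((M, N))     # analysis frame operator, M > N
R = rng.standard_normal((N, M))     # an arbitrary linear reconstruction
A = F @ R                           # M x M, but rank at most N
rank_A = np.linalg.matrix_rank(A)
print(rank_A, "<=", N, "<", M)

# With M = N, choosing R as the inverse of F gives A = I.
F_sq = rng.standard_normal((N, N))
is_identity = np.allclose(F_sq @ np.linalg.inv(F_sq), np.eye(N))
print(is_identity)
```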
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